Skip to content

CONFIG dictionaries stuff #7

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
huguesdevimeux opened this issue May 19, 2020 · 68 comments
Closed

CONFIG dictionaries stuff #7

huguesdevimeux opened this issue May 19, 2020 · 68 comments
Assignees
Labels
enhancement Additions and improvements in general refactor Refactor or redesign of existing code

Comments

@huguesdevimeux
Copy link
Member

We have discussed about the variable CONFIG previously on discord.
What should we do about it ?

I personally think we should modify at least a little bit of its architecture. it alters the readability a lot.
One problem with that would be that in changing CONFIG we would have to review the entire architecture of the code.

What do you think?

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

I agree it's not the most elegant piece of the code. I'd love to hear some suggestions here. I have some experience with the configparser module from the standard library.

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

A related issue is the constants.py file. Both this file and the CONFIG dict are basically ways of tracking hard-coded options. One nice-to-have would be to expose this options and allow the user to change or visualize them more easily. This would be achieved by the config files used by configparser.

@leotrs leotrs added the enhancement Additions and improvements in general label May 19, 2020
@eulertour
Copy link
Member

The CONFIG and constants.py issues can be handled separately, although we should do both. One of the confusing things about constants.py is that some of the variables defined there aren't actually constants (e.g. the media directories).

As for the CONFIG variables in each class I think it'd be cleaner to just add them as keyword arguments. We could leave the final **kwargs argument to accept additional config variables for backwards compatibility. For example Circle.__init__ here would be something like

def __init__(self, color=RED, close_new_points=True, anchors_span_full_range=False, **kwargs):
    self.color = color
    self.close_new_points = close_new_points
    self.anchors_span_full_range = anchors_span_full_range
    self.__dict__.update(kwargs)

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

Is it always the case that doing self.CONFIG['key'] would be equivalent to doing self.key instead? for example, are the keys of the CONFIG dictionary iterated over?

Another thing to think about is: which of the things currently stored in CONFIG should be turned into properties.

@eulertour
Copy link
Member

Is it always the case that doing self.CONFIG['key'] would be equivalent to doing self.key instead? for example, are the keys of the CONFIG dictionary iterated over?

Yes, but the code is pretty complicated.
https://github.com/ManimCommunity/manim/blob/master/manimlib/utils/config_ops.py#L23

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.

@PgBiel
Copy link
Member

PgBiel commented May 19, 2020

If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.

use __slots__ for this

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

Oooooooooooooh good one

@yoshiask
Copy link
Contributor

As someone who's most familiar with C#, the CONFIG is a big mess to me. There's no indication of which parameters are available and which ones aren't. kwargs has the same issue. This may just be a gripe with Python in general, but I think Manim is complicated enough without introducing unknown parameters.

@yoshiask
Copy link
Contributor

I prefer @eulertour's proposal

As for the CONFIG variables in each class I think it'd be cleaner to just add them as keyword arguments. We could leave the final **kwargs argument to accept additional config variables for backwards compatibility.

@eulertour
Copy link
Member

If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.

@leotrs There is no difference between CONFIG variables and instance variables. They're one and the same.

@leotrs
Copy link
Contributor

leotrs commented May 19, 2020

@leotrs There is no difference between CONFIG variables and instance variables. They're one and the same.

Well shit. Let's use kwargs and be done with it.

@huguesdevimeux
Copy link
Member Author

huguesdevimeux commented May 20, 2020

There is a difference that is purely for readability purposes. As @yoshiask said, it put unknows parameters, which is not good in a quite complex library as Manim.
I’m completely in favor of @eulertour ‘point.

@PgBiel
Copy link
Member

PgBiel commented May 20, 2020

In my opinion, the usage of __slots__ makes it pretty clear to anyone who reads the code which attributes the class may have. So it will turn the code even more readable!

(Also note that __slots__ has memory advantages when many instances of a class are created, but this shouldn't be the main argument since it's pretty much irrelevant most of the time... but any improvement in this sense is welcome, of course)

@leotrs
Copy link
Contributor

leotrs commented May 20, 2020

I'm leaning toward __slots__ now.

@PgBiel
Copy link
Member

PgBiel commented May 20, 2020

An update on that.

As discussed on Discord, we should use a dataclass model for our classes, using packages such as dataclasses (from stdlib) and attrs.

Sample code:

from dataclasses import dataclass

@dataclass
class MyCoolClass:
  a: int = 5
  b: int = 6

# then
obj = MyCoolClass(a=5, b=6)
# or
obj = MyCoolClass(5, 6)

Note that it automatically generates an __init__ method with each attribute as a parameter (which can be used as a kwarg or not).

Also note that, with the attrs package, the code would be the exact same. The only thing that would change would be the name of the decorator:

import attr

@attr.s
class MyCoolClass:
  a: int = 5
  b: int = 6

# everything else is the same.

Pros of using attrs over dataclasses:

  1. Support for validators (verification functions) for individual attributes - note, however, that this is possible with dataclassesby using the __postinit__ method;
  2. Support for converters, which convert the attribute's value straight away. Note that this is the preferred way of converting attributes, since, with dataclasses, one would need to convert using __postinit__, which means there would be a timespan during which the attribute would have the wrong type, and only on the __postinit__ call would this be corrected. (Of course, it's not like said timespan is too big though, so we can mostly ignore this)
  3. Automatic creation of __slots__: Really, just write @attr.s(slots=True) and it creates the __slots__, which gives a memory performance boost when many instances of a class are created. (Otherwise, though, it's a mostly invisible difference)

Cons of using attrs:

  1. It is another dependency. However, unlike other manim deps, such as numpy and... God forbid... cairo, it is very unlikely to randomly error during installation, as it is a much simpler module.

What do y'all think?

@PgBiel
Copy link
Member

PgBiel commented May 20, 2020

^Also, by the way: Worth noting that, if we actually implement this, this could be part of a major manim rework, i.e., working on this would imply reorganizing the code and whatnot. So we must keep this in mind.

This would likely be positive, of course. But we will need cooperation.

@leotrs
Copy link
Contributor

leotrs commented May 20, 2020

I like the idea of using dataclasses/attrs. I am usually partial to staying within stdlib, but since dataclasses were modeled directly after attrs, I think I'm OK with using attrs.

@yoshiask what do you think about this?

Regarding this being part of a major rework. I propose this: whoever ends up working on this would need to touch a number of files (and should probably wait until we have some minimal tests in place #22), but I suggest that this PR does not include any refactoring beyond changing CONFIG for attrs/dataclasses. In writing this PR, the author will have a better view of what refactoring needs to be done in the future, but we can leave that for later, I think.

@yoshiask
Copy link
Contributor

Dataclasses and attrs looks pretty good to me! Unfortunately I can't do it now, but I can start switching everything over to our new structure towards the end of the month, and hopefully by then we'll have CI and stuff set up.

@leotrs
Copy link
Contributor

leotrs commented May 20, 2020

I think we have consensus among everyone who's participated here, except for @eulertour.

@PgBiel
Copy link
Member

PgBiel commented May 20, 2020

Dataclasses and attrs looks pretty good to me! Unfortunately I can't do it now, but I can start switching everything over to our new structure towards the end of the month, and hopefully by then we'll have CI and stuff set up.

I'd be more than willing to help here as well!
I'll assign our names in the issue for now.

@leotrs leotrs added this to the Initial Cleanup milestone May 20, 2020
@eulertour
Copy link
Member

Do dataclasses/attrs allow for positional arguments to __init__ and for optional **kwargs for example what's used here? I'm reluctant to make such a large change that breaks old code.

@kilacoda-old
Copy link
Contributor

kilacoda-old commented May 20, 2020

Yes, if you declare the variable with a type but don't assign it a value.

from dataclasses import dataclass

@dataclass
class A:
    a:int
    b:str
    c:list = []

This is the same as:

class A:
    def __init__(self,a:int,b:str,c:list=[])

@leotrs
Copy link
Contributor

leotrs commented May 20, 2020

Only when declared with a type? Would this make some of our code have type hints while the rest doesn't?

@kilacoda-old
Copy link
Contributor

In python you can only create unassigned variable by providing a type I think. Also, we can always keep the type to be Any if we don't know it.

@kilacoda-old
Copy link
Contributor

We're using attrs though and not dataclasses.

@PgBiel
Copy link
Member

PgBiel commented Aug 8, 2020

correct, kilacoda. We're not going to use dataclasses, but attrs, which supports most python versions, including python 2

@naveen521kk
Copy link
Member

Is dataclassess introduced in python 3.7 right?

Some updates, dataclassess is supported in python 3.6 but not as a default module instead can be installed as a dependency from pip. I had know about this which working poetry and rich causeing me trouble lol.

@huguesdevimeux
Copy link
Member Author

But uuh .. isn't the plan to use attrs instead of dataclasses ?

correct, kilacoda. We're not going to use dataclasses, but attrs, which supports most python versions, including python 2

@eulertour
Copy link
Member

I talked some with Grant and he's now planning to migrate toward using regular classes, so it'd be better for consistency if we did that. Is anyone opposed to that?

@leotrs
Copy link
Contributor

leotrs commented Oct 4, 2020

As opposed to dataclasses OR attrs?

@eulertour
Copy link
Member

My takeaway from #271 is that attrs don't fit our use case. Dataclasses might but we don't know what problems we might run into so it'd be safer to stick with vanilla python. If someone is willing to vouch for dataclasses then maybe it'd be worth going back and asking him about it, but my prediction is that it will have some quirk like attrs that makes it difficult to adapt.

@leotrs
Copy link
Contributor

leotrs commented Oct 4, 2020

In principle I tend to agree. There's some stuff about attrs use of __post__init that doesn't sit well with me, and it seems to me that it will be a problem when moving away from CONFIG.

Cc. @PgBiel

@cobordism
Copy link
Member

cobordism commented Oct 8, 2020

The thing that CONFIG does really nicely is it allows anyone to easily create derived classes with a few default values changed.

class myRedText(Text):
    CONFIG = {
        colour: "RED"
    }

(or whatever the correct syntax is here)
is a lot simpler than

class myRedText(Text):
    def __init__(self, *args, **kwargs):
        c = kwargs.pop("colour", "RED")
        super().__init__(self, args, colour=c, kwargs)

or sth like that. (don't criticise my syntax in a github comment :)

My question is, how do dataclasses impact my ability to quickly and easily create such derived classes?

and maybe red text isn't the best example. another example might be TextMobjectFromPresetString

@leotrs
Copy link
Contributor

leotrs commented Oct 8, 2020

I don't know about dataclasses, but. The example you mention is indeed very convenient for a dev who is defining derived classes. However, this happens extremely rarely compared to the number of times that a user needs to use a class, or read the documentation, or would like to use autocompletion. CONFIG dictionaries make all of these almost impossible. So it's about the trade off of convenient for the dev or convenient for the user.

@leotrs
Copy link
Contributor

leotrs commented Oct 8, 2020

Also, the latest point brought up is that it may be better to not use either dataclasses or CONFIG. We're waiting for @PgBiel input.

@cobordism
Copy link
Member

this happens extremely rarely

As someone who learned python by learning how to make manim videos and reading 3b1b code, I naturally assumed that this happens all the time! Indeed I have been doing a lot of this :D

But I am starting to learn that the 3b1b code is somewhat idiosyncratic and may not be the best guide.

@leotrs
Copy link
Contributor

leotrs commented Oct 8, 2020

As someone who learned python by learning how to make manim videos and reading 3b1b code

yikes

@PgBiel
Copy link
Member

PgBiel commented Oct 8, 2020

My takeaway from #271 is that attrs don't fit our use case. Dataclasses might but we don't know what problems we might run into so it'd be safer to stick with vanilla python. If someone is willing to vouch for dataclasses then maybe it'd be worth going back and asking him about it, but my prediction is that it will have some quirk like attrs that makes it difficult to adapt.

What is this quirk you speak about?

But, vanilla classes could work fine if we do it right. If you have any ideas about this, you could make some model or prototype or something so that I can better understand what you have in mind

@eulertour
Copy link
Member

@PgBiel The issues at the top of #271
#271 (comment)

@PgBiel
Copy link
Member

PgBiel commented Oct 14, 2020

Regarding the first issue, I'm yet to come upon it, so I'll take a look later

Regarding the second, I'm pretty sure there are ways to bypass that; specifying types again is only necessary because of how python handles types in class attrs: it adds them to clazz.__annotations__, and, if you don't specify a type, the attribute is not stored there. However, if we don't use the automatic type to annotation feature for those cases, simply using a = attr.ib(b) instead of the fancy a: fulltype = b might have it automatically infer that the type is the same as that of the attribute in the superclass (I'm yet to test this, but I believe this could work).

Anyway, we could work with vanilla classes. However, I believe we should keep attrs for specific classes. We should use it where convenient, and maybe in important classes or something (to be discussed later).

@eulertour eulertour mentioned this issue Oct 22, 2020
6 tasks
@leotrs
Copy link
Contributor

leotrs commented Oct 29, 2020

It seems to me that a lot of the confusion here was moving from the current implementation using CONFIG dicts directly to attrs/dataclasses.

Now I'm more inclined to first move to vanilla classes ASAP, and then decide which classes to implement using attrs. Now that #620 is merged, I can work on removing CONFIG entirely and using vanilla classes. Once that's done we can revisit which classes need to move to attrs.

Yes, it's more work, but I'm willing to do it.

@cobordism
Copy link
Member

cobordism commented Nov 30, 2020

Now I'm more inclined to first move to vanilla classes ASAP

This is now done in PR #783
The plan is to merge that just after release 0.1.1 - so that it will be included in the subsequent 0.2.0 but we have a whole month to clean up.

However, once 783 is merged we have to address the question of how we want to refactor this going forward. dataclasses? fastcore? ...
We may also need a different approach for those CONFIGs that used to be part of mobjects that a user might instantiate vs the CONFIGS that were in Scenes that a user was meant to subclass (like GraphScene for example). The former only requires keyword arguments to the constructor, the latter might require a custom __init__().

...Of course, why a GraphScene is a scene at all, as opposed to a Graph being a mobject is another question entirely ...

@leotrs
Copy link
Contributor

leotrs commented Nov 30, 2020

we have to address the question of how we want to refactor this going forward. dataclasses? fastcore? ...

The question is: what do our classes need right now? I'm not certain we necessarily have a problem that needs to be fixed right now. (Other than the following point you make, of course...)

We may also need a different approach for those CONFIGs that used to be part of mobjects that a user might instantiate vs the CONFIGS that were in Scenes that a user was meant to subclass (like GraphScene for example). The former only requires keyword arguments to the constructor, the latter might require a custom init().

I'm fairly certain that most (if not all) of the CONFIG keys of GraphScene can just be replaced by setting attributes at the top of the construct function. The code

class MyScene(GraphScene):
    CONFIG = {"x_min": -1}
    def construct(self):
        self.setup_axes()

Can be replaced with

class MyScene(GraphScene):
    def construct(self):
        self.x_min = -1
        self.setup_axes()

In fact, we could encourage use of Scene.setup() for this purpose.

...Of course, why a GraphScene is a scene at all, as opposed to a Graph being a mobject is another question entirely ...

I think we must answer this question before choosing if we need to migrate to any other class and/or how and when. What is the concrete problem that needs to be addressed after we remove CONFIG once and for all?

@behackl behackl modified the milestones: Release v0.1.1, Release v0.2.0 Dec 1, 2020
@cobordism
Copy link
Member

closed by PR #783

For discussions on future refactoring, please open new dedicated issues.

@ManimCommunity ManimCommunity locked as resolved and limited conversation to collaborators Dec 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Additions and improvements in general refactor Refactor or redesign of existing code
Projects
None yet
Development

No branches or pull requests

10 participants