Implement self ref fk ordering#322
Implement self ref fk ordering#322ozer550 wants to merge 8 commits intolearningequality:release-v0.9.xfrom
Conversation
|
|
||
| return tasks | ||
|
|
||
|
|
There was a problem hiding this comment.
This looks pretty good, but I think it can be simplified a little more.
First, we know there's a chance of at least two loops:
- Your first loop, that goes over tasks and sets self-ref properties, and determines what needs further lookups
- Another loop to process the parent lookups
With the morango_ordering feature, we expect that a parent comes before a child in the stream, if the parent is being serialized. That means, in the first loop, we can eagerly check the cache for the parent. If the parent is not in the cache, we queue the child for further lookups. That means we're able to handle all children, whose parents are in the stream, in the first loop, minimizing the length of the second loop.
The lookup of parents in the store looks good, and setting those parents in the cache makes sense.
As written, the second loop goes over all cached parent IDs. This could be significantly larger than the remaining children queued by the first loop. In other words, looping over the remaining queued children directly should be simpler and because the cache is indexed by parent, we can easily lookup the parent order using the self_ref_fk_value already set on the child in the first loop. Altogether, this should eliminate the need for children_by_parent, and simplifies the external_parent_ids to the parent IDs pulled from the remaining queued children.
As it's written, the second loop will get slower and slower as the stream count increases. This is especially so since the cache isn't reset when we detect those 'checkpoints' in the stream where we go from one model to another. But simplifying the second loop as I mentioned will eliminate the former cache issue, leaving only the latter (to reset the cache at the appropriate time)
Summary
WIP
Short description
TODO
Reviewer guidance
If you PR has a significant size, give the reviewer some helpful remarks
Issues addressed
List the issues solved or partly solved by the PR
Documentation
If the PR has documentation, link the file here (either .rst in your repo or if built on Read The Docs)