
Add Python implementations for LeetCode 155 and 705 #2629

Open
rishigoswamy wants to merge 1 commit into super30admin:master from rishigoswamy:master

Conversation

@rishigoswamy

No description provided.

@rishigoswamy reopened this on Feb 4, 2026
@super30admin (Owner)

Strengths:

  • The solution correctly implements the HashSet operations with constant time complexity.
  • The approach of using a 2D array with direct addressing is appropriate for the given constraints.
  • The code is well-structured and readable, with comments explaining the approach.

Areas for Improvement:

  1. Memory Efficiency: The current implementation pre-allocates all secondary arrays (1000 arrays of size 1001). This uses about 1e6 booleans, which is acceptable but not memory-efficient for small sets. Consider lazy initialization: only create a secondary array when a key is added to that primary bucket. This would reduce memory usage when the number of elements is small.
  2. Redundant Calculation: The hashKeys method is called twice in each operation (add/remove/contains), which recalculates the same hashes. You can compute the hashes once and reuse them.
  3. Code Clarity: The hashKeys method returns a list that is then indexed at the call sites. It would be clearer to return a tuple and unpack it, or to inline the two hash calculations and drop the helper (and its call overhead) entirely; a sketch follows this list.
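
For the third point, a minimal sketch of a tuple-returning helper, reusing the primaryArraySize attribute from the add example below (the exact names in the student's code may differ):

def hashKeys(self, key: int) -> tuple:
    # Compute both indices in one place; callers can unpack the pair
    # directly instead of indexing into a returned list.
    primary = key % self.primaryArraySize      # index into the primary array
    secondary = key // self.primaryArraySize   # slot inside that bucket
    return primary, secondary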

Suggested Improvements:

  • For lazy initialization, in the add method, check if the secondary array for the primary index exists. If not, create it. This is similar to the reference solution in Java.
  • To avoid redundant calculations, compute the primary and secondary indices once per operation and store them in local variables.

Example of improved code for add:

def add(self, key: int) -> None:
    primary = key % self.primaryArraySize
    secondary = key // self.primaryArraySize
    if self.table[primary] is None:
        # Initialize the secondary array only when needed
        self.table[primary] = [False] * self.secondaryArraySize
    self.table[primary][secondary] = True

But note: in the current student code, the entire table is pre-allocated with lists of booleans. So to implement lazy initialization, you should initialize self.table with None for each primary bucket, and then allocate the secondary array on demand.
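
A minimal sketch of that change, assuming the attribute names from the example above (contains and remove simply treat an unallocated bucket as "key not present"):

def __init__(self):
    self.primaryArraySize = 1000
    # No secondary arrays yet; each bucket is allocated on its first add
    self.table = [None] * self.primaryArraySize

def contains(self, key: int) -> bool:
    primary = key % self.primaryArraySize
    secondary = key // self.primaryArraySize
    # An unallocated bucket cannot contain the key
    return self.table[primary] is not None and self.table[primary][secondary]

def remove(self, key: int) -> None:
    primary = key % self.primaryArraySize
    secondary = key // self.primaryArraySize
    # Nothing to do if the bucket was never allocated
    if self.table[primary] is not None:
        self.table[primary][secondary] = False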

Also, be careful with the bucket sizes: for key 0 the secondary index is 0, but for key 1000000 the primary index is 0 and the secondary index is 1000, so the secondary array for primary bucket 0 must have size 1001 (to include index 1000). Every other primary bucket only ever sees secondary indices up to 999 (key 999999 gives primary=999, secondary=999), so buckets 1 to 999 can be allocated with size 1000. The reference solution handles this.
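
A sketch of add with that per-bucket sizing, under the same naming assumptions:

def add(self, key: int) -> None:
    primary = key % self.primaryArraySize
    secondary = key // self.primaryArraySize
    if self.table[primary] is None:
        # Bucket 0 must also hold secondary index 1000 (for key 1000000),
        # so it gets one extra slot; every other bucket needs 1000 slots.
        size = self.primaryArraySize + 1 if primary == 0 else self.primaryArraySize
        self.table[primary] = [False] * size
    self.table[primary][secondary] = True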

Overall, the student's solution is correct and efficient in time, but could be improved in memory usage.

@super30admin (Owner)

Your solution is well thought out and correctly implements the HashSet operations with constant time complexity. The two-level hashing approach is appropriate for the problem constraints.

One area for improvement is space efficiency. Currently, you allocate the entire 2D array upfront, which holds about 1 million boolean entries (roughly 1 MB in a language with packed boolean arrays such as Java's boolean[]; a Python list stores references, so the actual footprint is several times larger). This is acceptable given the constraints (key range up to 10^6 and at most 10^4 operations), but it could be optimized. The reference solution uses lazy initialization: it only creates a secondary array when a key in that primary bucket is added for the first time. This reduces the space usage when the number of keys is small.

For example, in the reference solution:

  • The primary array is initialized to size 1000, but all elements are initially null.
  • When a key is added, the primary index is computed. If the secondary array for that primary index is null, it is created only then. Additionally, for the primary index 0, the secondary array is created with size 1001 to handle the key 1000000 (since 1000000 % 1000 = 0 and 1000000 / 1000 = 1000), while other primary indices use arrays of size 1000.

You could consider modifying your solution to use lazy initialization to save space. However, your current solution is correct and efficient enough for the problem constraints.
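
Either way, the behavior should match the operation sequence from the problem statement. For a quick sanity check (assuming the class is named MyHashSet, as in the LeetCode template):

obj = MyHashSet()
obj.add(1)
obj.add(2)
print(obj.contains(1))   # True
print(obj.contains(3))   # False
obj.add(2)               # adding an existing key is a no-op
print(obj.contains(2))   # True
obj.remove(2)
print(obj.contains(2))   # False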

Another minor point: you call the hashKeys helper, and therefore recompute the indices, twice in each of the add, remove, and contains methods. You could compute them once per call and store them in local variables to avoid the redundant work. For example:

def add(self, key: int) -> None:
    # Unpack both indices from a single hashKeys call
    primaryKey, secondaryKey = self.hashKeys(key)
    self.table[primaryKey][secondaryKey] = True
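
The same one-call pattern carries over to the other two operations, along these lines (assuming hashKeys returns the pair and the table is fully pre-allocated, as in your current code):

def remove(self, key: int) -> None:
    primaryKey, secondaryKey = self.hashKeys(key)
    self.table[primaryKey][secondaryKey] = False

def contains(self, key: int) -> bool:
    primaryKey, secondaryKey = self.hashKeys(key)
    return self.table[primaryKey][secondaryKey]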

Overall, your solution is correct and efficient. Well done!
