
Fix bug with dependency bigrams (with lemmas and lowercasing enabled) #1

Open
prrao87 wants to merge 3 commits into kristopherkyle:master from prrao87:master

prrao87 commented Aug 6, 2020

Hey Kris,

First of all, thanks so much for writing such a nice, easy-to-use tool for corpus analysis! I've been using it in some work on gendered language in a news corpus, and it's proving very useful so far. I especially like the dependency bigrams functionality - it's helping me a lot in interpreting the linguistic content of my corpus (far more than, say, keyness or n-grams).

Bug fix

I noticed a bug when running the dependency bigram function with both lemmas and lowercasing enabled. Instead of lowercasing the lemma once it was obtained, the function simply returned the lemma as-is, regardless of whether the lower=True keyword was specified. The fix is simple - it's just an additional indented if-block, as can be seen in the commit diff.

For example, when lower=True and lemma=True are both set, I expect to see something like this:

cup_win
canada_represent

However, with the current version, we get this:

Cup_win
Canada_represent
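To illustrate the idea behind the fix, here is a minimal, self-contained sketch. It is not the code from the commit diff: the `Token` class and the `token_form` and `dep_bigram` helpers are hypothetical names made up for this example, and I'm assuming the bigram is written as dependent_head. The only point is the nested `if lower:` check inside the lemma branch.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for a parsed token (not the toolkit's own type)."""
    text: str
    lemma: str


def token_form(token: Token, lemma: bool = True, lower: bool = True) -> str:
    """Return the string used for one side of a dependency bigram."""
    if lemma:
        form = token.lemma
        # The nested check: lowercase the lemma too, rather than
        # returning it unchanged when lower=True.
        if lower:
            form = form.lower()
    else:
        form = token.text
        if lower:
            form = form.lower()
    return form


def dep_bigram(dependent: Token, head: Token, lemma: bool = True,
               lower: bool = True, sep: str = "_") -> str:
    """Join dependent and head forms (ordering is an assumption)."""
    return (token_form(dependent, lemma, lower)
            + sep
            + token_form(head, lemma, lower))


# Proper nouns often keep their capital letter as a lemma, which is
# how "Cup_win" can appear when the lemma is never lowercased.
cup = Token(text="Cup", lemma="Cup")
won = Token(text="won", lemma="win")

fixed = dep_bigram(cup, won, lemma=True, lower=True)
unlowered = dep_bigram(cup, won, lemma=True, lower=False)
print(fixed)      # cup_win
print(unlowered)  # Cup_win
```

With lower=False the capitalized lemma passes through untouched, which matches the output the current version produces even when lower=True is requested.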

Testing

I tested the fix on multiple cases, and the lower=True keyword argument now works as intended. For now I've only pushed the fix to the dev directory - let me know if this works or if you want it pushed to the main directory as well.

Thanks again for making this - I'm sure it'll be a useful resource for others in the field too. Cheers!
