Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Unfortunately I did this for work so I can't open-source the code without some awkward conversations (which may be possible and worth it, but I haven't yet). I'm sure I could write a blog post on it, but I don't have a blog, so ... sorry. Peter Norvig's post plus my point about tries and an efficient comparison algorithm at least give you a head start.


Norvig wrote a similar expansion in his chapter for the book Beautiful Data (along with several other small programs to do fun things with natural language corpus data).

You can find it with a web search.

(His version didn't use a trie, because Python's built-in dicts are much more efficient than a trie in Python, even with the extra redundancy that a trie eliminates.)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: