
raion on V2EX[1][2] reverse-engineered Tencent QQ's scraping code and extracted its list of URLs and keywords:

1. S.TAOBAO.COM/SEARCH?

2. LIST.TMALL.COM/SEARCH_PRODUCT.HTM?

3. (30, 0xDDA1029, 0x9E67F3BB, 0xB18ACC45, 0x597CF438): b'', # not yet decoded

4. SEARCH.JD.COM/SEARCH?

Keyword Group 1:

5. 古着 (kanji meaning old clothes)

6. VINTAGE

Keyword Group 2:

7. 融券 (securities lending, i.e. short selling on margin)

8. 融资 (financing)

Keyword Group 3:

9. 炒股 (stock trading)

10. 股票 (stock)

[1]: https://archive.vn/c1ABO post id 229

[2]: https://sm.ms/image/hxiVvDNsf2lFJ7u



The last URL, the one not yet decoded above, has been found by lhprojects through brute-forcing Chinese e-commerce website URLs. Its first 30 characters are:

`uland.taobao.com/sem/tbsearch?`

Credit to lhprojects https://nbviewer.jupyter.org/github/lhprojects/blog/blob/mas...
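
For anyone wondering what "brute-forcing" means here, a minimal sketch of the idea: take a list of candidate URLs, cut each to the stored length, hash it, and compare against the stored words. This is not lhprojects' actual code. The thread does not say which hash QQ uses, so MD5 read as four little-endian 32-bit words is purely an assumption, as is uppercasing the candidate (a guess from the decoded entries being uppercase). `digest_words`, `find_match` and the candidate list are hypothetical names and data.

```python
import hashlib
import struct


def digest_words(text):
    """Hash `text` and return the digest as four 32-bit words.

    ASSUMPTION: MD5 with little-endian word order. The real hash and
    word layout used by the client are not stated in the thread.
    """
    return struct.unpack("<4I", hashlib.md5(text.encode("ascii")).digest())


def find_match(target_len, target_words, candidates):
    """Return the first candidate prefix matching length and digest, else None."""
    target_words = tuple(target_words)
    for url in candidates:
        # ASSUMPTION: the client uppercases the URL before hashing.
        prefix = url.upper()[:target_len]
        if len(prefix) == target_len and digest_words(prefix) == target_words:
            return prefix
    return None


# Hypothetical candidate list; a real run would use a large set of
# Chinese e-commerce URLs.
candidates = [
    "s.taobao.com/search?",
    "search.jd.com/search?",
    "uland.taobao.com/sem/tbsearch?q=test",
]

# Demo target built from a known string so the sketch is self-checking.
# It is NOT the tuple from the post above.
demo_prefix = "ULAND.TAOBAO.COM/SEM/TBSEARCH?"
demo_target = (len(demo_prefix),) + digest_words(demo_prefix)

match = find_match(demo_target[0], demo_target[1:], candidates)
print(match)
```

The stored length does most of the work: it lets you discard nearly every candidate before hashing, so the search is only as good as the candidate list.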


I was expecting some clever brute-forcing; turns out it was just a lucky guess.



