Hacker News

That might be fixable, if people want to expend the effort. Wiki pages are almost certainly copyrightable, so the owners could send DMCA takedown notices to github-wiki-see.page. If they're not responsive, send the DMCA notices to Google, which should be required to delist them. Unfortunately you have to do it on a URL-by-URL basis, and you can only send notices for pages you actually own copyright for, so it would mean a big coordinated effort to get them brought down.

I just don't understand why Google itself allows this and doesn't rank these sorts of sites lower. They're clearly garbage sites with low utility.



Please read my explanation at http://github-wiki-see.page/ and observe why it exists. I believe it to be a site with extremely high utility.

It has already helped convince GitHub to start gradually changing its policy, in place since 2012, of not letting GitHub wiki pages be indexed. For at least 9 years, people were writing content into GitHub wikis without realizing it wasn't indexed at all.

I'm happy to answer any questions or suggestions you have.


I also do not host the content at all. That said, people who move off GitHub Wikis have submitted outdated content removal requests to Google, and those are honored.


Google puts substantial effort into identifying copycat content. The main way they do that is to see which site had the content first.

Unfortunately, with smaller sites it could be a few days until Google's search bot finds the content, and copycat sites often have aggressive scrapers, so they appear to have the content first.

From Google's point of view, the copycat is the original, and the original is the copycat.
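
To make the failure mode concrete, here is a minimal sketch of a "who had it first" heuristic. This is only an illustration of the idea described above, not Google's actual algorithm; the function name `pick_original`, the example URLs, and the dates are all hypothetical.

```python
from datetime import datetime


def pick_original(first_seen):
    """Return the URL with the earliest timestamp in the given mapping."""
    return min(first_seen, key=first_seen.get)


# Hypothetical timeline: the small site publishes on June 1 and a scraper
# copies the page on June 2.
published = {
    "small-site.example/post": datetime(2021, 6, 1),
    "scraper.example/copy": datetime(2021, 6, 2),
}

# The crawler only sees crawl times. Here it reaches the scraper the same
# day it copies, but takes a few days to find the small site.
crawled = {
    "small-site.example/post": datetime(2021, 6, 5),
    "scraper.example/copy": datetime(2021, 6, 2),
}

print(pick_original(published))  # judged by true publication time
print(pick_original(crawled))    # judged by crawl time, as a crawler must
```

Judged by publication time the small site is picked, but judged by crawl time the scraper is, which is the inversion described above.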

There are also some kinds of copycat content which users actually prefer. For example, sites which bypass paywalls, sites which quote other sites, sites that display decrapified content from another site, etc.


In the case of http://github-wiki-see.page/, the original isn't even on Google! That's why my copycat wins.

FWIW, GitHub seems to be letting some wikis be indexed on a test basis, and I am very happy to see they are outranking GHWSEE. That said, under the criteria as best I can guess them, there are still many publicly editable wikis on repos with many stars, and publicly un-editable wikis on repos with few stars but useful information, that aren't being indexed.



