Fun fact about Google Scholar: it’s "free", but it’s just another soulless Google product - no clear strategy, no support, and a fragile proprietary dependency in what should be an open ecosystem. This creates inherent risks for the academic community. We need the equivalent of arXiv for Google Scholar.
For people unfamiliar, Semantic Scholar is run by the Allen Institute and has been researching accurate AI summarization and semantic search for years. They also support author name changes.
It advertises itself as "from all fields of science" -- does that include fields like economics? Sociology? Political science? What about law journals? In other words, is the coverage as broad? And if it doesn't include certain fields, where is the "science" line drawn?
And I'm curious if people find it to be as useful (or more) just in terms of UX, features, etc.
They are substantially smaller in coverage, but have higher quality in my experience. Remarkably, they are also willing to correct their data if you notify them. This is of course in stark contrast to Google Scholar, where the metadata of papers is frequently wildly inaccurate. On top of this, Semantic Scholar shares their underlying data (although you need to request an API key). Overall, they have been growing slowly and steadily over the years and I have a lot of respect for what their team is doing for researchers such as myself.
Now for the less great.
They are pushing the concept of "Highly Influential Citations" [1] as their default metric, which to the best of my knowledge is based on a single workshop publication that produced a classifier trained on about 500 samples to classify citations. I am a very harsh critic of any metrics for scientific impact, but this is just utter madness. Guaranteeing that this metric is not grossly misleading is nearly impossible, and it feels like the only reason they picked it is because Etzioni (AI2 head) is the last author of the workshop paper. It should have been at best a novelty metric and certainly not the default one.
Recently, they introduced their Semantic Reader functionality and are now pushing it as the default way to access PDFs on the website, forcing you to click on a drop-down to access plain PDFs. It may or may not be a great tool, but it feels somewhat obvious that they are attempting to use shady patterns to push you in the direction they want.
Lastly, they have started using Google Analytics, which is not great, but I can understand why they go for the industry default.
Overall, I use them nearly daily and they are the best offering out there for my area of research. Still, I am at times tempted to grab the data and create an alternative (simpler) frontend with fewer distractions and less "modern" web nonsense.
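For anyone tempted to do the same, here is a rough sketch of what pulling search results might look like. It assumes the public Graph API paper-search endpoint and the `x-api-key` header; the function names and the chosen fields are mine, purely for illustration, so check the official API docs before relying on any of it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Semantic Scholar Graph API paper-search endpoint.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_request(query, api_key=None,
                         fields=("title", "year", "citationCount"), limit=10):
    """Build (but do not send) a paper-search request. Hypothetical helper."""
    params = urllib.parse.urlencode(
        {"query": query, "fields": ",".join(fields), "limit": limit}
    )
    # Assumption: the key, if you have one, is passed in the x-api-key header.
    headers = {"x-api-key": api_key} if api_key else {}
    return urllib.request.Request(f"{SEARCH_URL}?{params}", headers=headers)


def search_papers(query, api_key=None):
    """Send the request and return the list of papers found under 'data'."""
    request = build_search_request(query, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("data", [])
```

Calling `search_papers("citation intent classification")` would hit the network; a bare-bones frontend is then little more than a loop over the returned dicts.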
Semantic Scholar's search is pretty good, but there are also a variety of other (paid) projects that expand on its API. Look at tools like Scite and LitMaps for what's possible with the Semantic Scholar dataset.
As for coverage, I think it focuses more on the life sciences, but I'm not positive about that.
I did a test across all the Google Scholar alternatives I could find a few months ago. I got the same feeling as after Google Reader ceased to exist. Literally nothing filled the gap.
My conclusion is that any such system needs to be "complete" or almost complete to be useful. By system, I mean a service or some handcrafted system where I could track anything. In all fairness, Sci-Hub partially fits the bill here and it's a big plus to society.
But the point is Google Scholar is complete in the sense that with a high probability I will find any paper I'm looking for along with reliable metadata. That's great, but the fact that they go above and beyond to prevent sharing that data is IMO backwards and against all academic research principles, and this should raise questions within the research communities that rely on it.
Yes. On one hand I’d like Google to improve things a bit. There are some rough edges, which is a shame because it indexes some things that are not in Scopus or Web of Knowledge, like theses and preprint repositories. On the other hand I worry that some manager somewhere would kill it if they realised that it is still around.
Every 1-2 months when Chrome updates I get banned by their throttling mechanism because their extension makes too many requests and they see "unusual traffic".
It can take 1-2 weeks for the ban to go away so I can use it again. There's no way to get in contact with anyone. I tried the Chrome extension email and the support forums.
It's a good reality check. There's no real support behind it and it can go away just like Google Reader did.
I think the motivations behind it are laudable, but they should not be the answer to the actual problem.
I’m fairly sure they only exist because Larry/Sergey might give half a fuck if they killed it outright, and it has a small enough team that the cost savings from killing it aren’t enough for Ruth to want to make that argument.
I miss the Google of yesteryear which had an altruistic streak and felt that enriching the world's ability to share and process information would ultimately accrue benefit to Google as well.
The Google of today is far more boring and less helpful.
It's a hard job to maintain systems in an altruistic state, because opportunists and parasites are drawn in larger and larger numbers to wherever resources accumulate.
Google has done a decent job of not turning fully into an Oracle, for example.