NerdyData Search Engine and Interview


In the sea of the SEO tools industry it is getting harder to find quality tools and distinguish them from the rubble being dangled before us. But from time to time a new tool emerges that we instantly sink our teeth into, and one that recently got our attention is the NerdyData search engine.
We’ve played with it for some time and decided we like it. Here are some of our takeaways from our testing, along with an interview with one of the founders of NerdyData, Steven Sonnes. In short, NerdyData is a source code search engine that helps you find any bit of code you want. To some of you this may sound unimportant, but we think otherwise. Here are some of the things, aside from the obvious ones, that you can use NerdyData for:
– find all clients of a given SEO company
– find link/blog networks
– discover AdSense empires
– find out who’s using your template
– “source code stamping” your work so you know when it’s being plagiarised
That is just off the top of our heads; there are some very nice ways to use NerdyData for both good and evil! The best way to find out is to visit the site and play around, but if you want more information first, here is what one of the founders had to say.

1. How did you come up with the idea to create NerdyData?

We needed a way to search among millions of websites for a particular snippet of code, and then find backlinks to those matching domains. It was the surprising realization that this service did not exist that inspired us to work day and night on NerdyData. Everyone can press CTRL+U on a browser and search through one page’s code, but no one can search code across millions of sites in under a second – so we’re the first to do that.

2. What uses did you have in mind when creating this source code search engine?

At first, we were interested in the backlink research capabilities, and lately we’ve been focusing on the great uses it has for SEO research, market analysis, and usage for programmers (and hackers!).

Our target market is SEO agencies looking for insights into metadata, code usage, or other snippets of source code. We offer many tools, like visual backlink discovery, meta search interfaces, and a variety of pre-parsed, easy-to-use HTML tag searches. If you or your competition has a particular piece of code, like a Google Analytics account ID or a specific JavaScript library, you can easily download the list of domains that match that unique query.

Some interesting example uses are:

    – Help you find new leads and understand your competitors: http://blog.nerdydata.com/post/57308630996/how-we-found-all-of-optimizleys-clients

    – Discover trends (example: how many people use jQuery vs Bootstrap)

    – Find all sites that contain links pointing to any given URL

    – Developers and designers can find examples of code or theme usage

    – Security researchers can find all sites with a certain vulnerability or old plug-in
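
To make the core idea concrete, here is a small do-it-yourself sketch of what “find every domain whose source code contains a given snippet” means in practice: fetch a handful of homepages and check each one for a string. The domain list and the Analytics ID below are placeholders, and NerdyData obviously runs this kind of query against a pre-built index of many millions of pages rather than live requests; this is only a toy illustration of the concept.

```python
# Toy illustration of "source code search" over a small, hand-picked list of
# domains. The domains and the Analytics ID are placeholders; a real index
# would be queried instead of fetching pages live.
import requests

SNIPPET = "UA-1234567"                    # e.g. a Google Analytics account ID
DOMAINS = ["example.com", "example.org"]  # hypothetical domains to check

matches = []
for domain in DOMAINS:
    try:
        html = requests.get(f"http://{domain}", timeout=10).text
    except requests.RequestException:
        continue  # skip unreachable sites
    if SNIPPET in html:
        matches.append(domain)

print("Domains containing the snippet:", matches)
```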

3. Any new plans on expanding your search engine?

Growth. We’ve started our plans to grow our index and crawl more webpages. We aim to be the one-stop-shop for customizable lead searches. We also have some new interfaces we’re starting to build out, and some premium users might see those roll out in a month or so.

Custom Solutions: We’ve also had great success generating customized reports for clients. By offering fully customizable queries, we provide data tailored to a specific client or niche industry. There are so many ways to query our data, for example: “Find all websites that have ‘Web Design’ in the title tag, ‘responsive’ in the meta description, and use Google Analytics”, or “All sites that use WordPress and Flash objects that have ‘real estate’ in the meta description.”
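
To illustrate what such a compound query actually checks, here is a rough sketch that evaluates those three conditions (‘Web Design’ in the title tag, ‘responsive’ in the meta description, presence of Google Analytics) against a single fetched page. It assumes the requests and beautifulsoup4 libraries and a placeholder URL, and it mimics the logic of the query only; it is not NerdyData’s query syntax or implementation.

```python
# Sketch: test one page against the compound query described above.
# The URL is a placeholder; the Google Analytics check is a crude heuristic.
import requests
from bs4 import BeautifulSoup

def matches_query(url: str) -> bool:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    title = soup.title.get_text() if soup.title else ""
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta.get("content", "") if meta else ""

    return (
        "web design" in title.lower()
        and "responsive" in description.lower()
        and "google-analytics.com" in html  # crude "uses Google Analytics" check
    )

print(matches_query("http://example.com"))
```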

4. Do you crawl homepages only or deep pages too?

We crawl every webpage we can store on our drives. Currently, our index includes the homepages of all .com, .net, .org, .biz, .us, and .info domains, as well as deeper pages of popular domain names. Our crawler has visited over 140 million unique sites, and we’ve collected terabytes of HTML, JavaScript, and CSS code.

5. What are some innovative things we can do with NerdyData?

Being able to search through <meta> tags and <title> tags is great for SEO folks – it’s an interesting way to perform keyword research. You can search the raw source code of real websites, not just pre-parsed tags or offline code projects. We also offer unique and easy-to-use search interfaces for non-technical users, and we have plans to keep adding new products and features. We understand that source code can be overwhelming, so we’re trying to bridge that gap for the non-technical folks and use this to our advantage.

6. Do you have a tip for potential and existing users on how to use NerdyData?

Querying on NerdyData is a little different from querying a traditional search engine, where most queries a user makes are plain alphanumeric strings. We have parsed source code in a way that ensures the majority of users will find accurate results for their queries, without having to redefine their search.

For most terms, we will do an exact match search and return results that match pages where the term is exactly the same as what was entered into the search box. A search for “Michael Jordan” would find pages where the words “Michael” and “Jordan” appear next to each other on the page, and not pages that only happen to contain the words “Michael” and “Jordan” on them individually. For searches with many distinct words, we may also return results that contain partial matches, along with exact match results.
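
To illustrate the difference described above, here is a tiny sketch contrasting an exact phrase match with a looser “all words appear somewhere” match. The two sample pages are made up and nothing here reflects NerdyData’s internals.

```python
# Exact phrase match vs. "both words appear somewhere" match.
page_a = "Michael Jordan won six NBA titles."
page_b = "Michael Phelps swam across the river Jordan."

query = "Michael Jordan"

def exact_match(text: str, phrase: str) -> bool:
    return phrase.lower() in text.lower()

def all_words_match(text: str, phrase: str) -> bool:
    return all(word.lower() in text.lower() for word in phrase.split())

for page in (page_a, page_b):
    print(exact_match(page, query), all_words_match(page, query))
# page_a: True True   (the phrase appears verbatim)
# page_b: False True  (both words appear, but not next to each other)
```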

https://search.nerdydata.com/features

https://search.nerdydata.com/documentation

https://search.nerdydata.com/leads

7. From our testing we found a few small bugs and some repeating results, but also some results that no other premium tool was able to catch, not even GWT (Google Webmaster Tools). What are your thoughts on this?

We index links that are present on page load (in <a> tags) as well as links rendered after page load (via JavaScript). This approach lets us discover links that other tools cannot, such as links from widgets and links that are not viewable by simply clicking “view source” in your browser (AJAX, JSON, dynamic content). The trade-off of rendering pages is that it requires much more server processing and storage capacity. Our index is massive and at present takes a few weeks to recrawl and make searchable, so it is possible that links are indexed that are no longer present on the pages we show results for (if they were removed in the last 30 days).
As far as the repeating results go, we are aware of a bug that causes duplicate entries to appear for certain queries, and it will be fixed in our next code push this month.
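
The distinction between links present in the raw HTML and links injected by JavaScript is easy to see for yourself. The sketch below compares the two sets of links for a single page, assuming the requests, beautifulsoup4 and playwright packages are installed; the URL is a placeholder, and this is not NerdyData’s crawler, just a way to reproduce the difference Steve describes.

```python
# Links visible in the static HTML ("view source") vs. links present in the
# DOM after JavaScript has run. Assumes requests, beautifulsoup4, playwright.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright

url = "https://example.com"  # placeholder

# 1. Links found in the raw HTML response
raw_html = requests.get(url, timeout=10).text
static_links = {
    urljoin(url, a["href"])
    for a in BeautifulSoup(raw_html, "html.parser").find_all("a", href=True)
}

# 2. Links found in the rendered DOM
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(url, wait_until="networkidle")
    rendered_links = set(
        page.eval_on_selector_all("a[href]", "els => els.map(e => e.href)")
    )
    browser.close()

print("Links only a rendering crawler would see:", rendered_links - static_links)
```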
Steve’s answers provide good insight into what NerdyData can do, and from our tests it can do a few more things. We were able to find sites that accept certain payment options, such as PayPal, and to locate sites that mention a brand without linking to it; there are premium tools that offer just that one piece of functionality. Of course, other things, like finding network sites or sites that use the same AdSense or Google Analytics account, were easy to find with NerdyData search…
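For the curious, here is a rough sketch of the unlinked brand mention check for a single page: the brand name appears in the visible text, but no anchor on the page points at the brand’s domain. The brand name, domain and URL below are placeholders; a tool like NerdyData runs the equivalent check across its whole index rather than one URL at a time.

```python
# Check one page for a brand mention that is not accompanied by a link.
# Brand name, brand domain and URL are placeholders.
import requests
from bs4 import BeautifulSoup

def mentions_without_link(url: str, brand: str, brand_domain: str) -> bool:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    mentioned = brand.lower() in soup.get_text().lower()
    linked = any(
        brand_domain in a["href"] for a in soup.find_all("a", href=True)
    )
    return mentioned and not linked

print(mentions_without_link("http://example.com", "Acme Widgets", "acmewidgets.com"))
```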
This is definitely a search engine you want to check out if you haven’t already. If you did, or if you decide to do so, please share your thoughts and findings in the comments, especially if you find a use we haven’t mentioned in this post.

Dan Petrovic, the managing director of DEJAN, is Australia’s best-known name in the field of search engine optimisation. Dan is a web author, innovator and a highly regarded search industry event speaker.
ORCID iD: https://orcid.org/0000-0002-6886-3211



3 thoughts on “NerdyData Search Engine and Interview”

  1. Chad greenberg says:

    Interesting find! Lots of different uses in the world of SEO and lead generation. Anyone know where the pricing page is?

  2. mrmarchuk says:

    Neat, but do they offer an option for webmasters to block the indexing of their source code in this search engine? I could see how quite a few people might *not* want their source code so easily searched. For me, it doesn’t matter too much, but it just makes sense that a search engine would have the option to block their robots from a site at the request of a webmaster.

  3. igl00 says:

    Love the idea!