Find the Droids You’re Looking for With GitHub’s Powerful New Search Tools

GitHub’s Octobi Wan Catnobi. Image: GitHub

Open source is about building on the work of others and not having to reinvent the wheel. But if you can’t find the code you need then you’re stuck reinventing the wheel. Again.

To help you find exactly the wheels your project needs, code hosting giant GitHub has announced a new, much more powerful search tool that peers inside GitHub repositories and offers dozens of filters to help you discover the code you need.

The new search further cements GitHub’s place as the go-to source not just for publishing code on the web, but also for discovering it.

While GitHub’s new search lacks the web-wide reach of more general code search engines like Google’s once-mighty Code Search (now a hollow shell of its former self), it’s likely to return more useful results thanks to some nice extras like the ability to see recent activity and narrow results by the number of users, stars and forks.

GitHub’s advanced search page now supports operators such as @username, which limits results to your own repositories (or another user’s repos); repo:name, which limits them to code from a single repository; and a path filter, which narrows them to code from a particular path within a repo. You can also limit by file extension, repo size, number of forks, number of stars, number of followers, number of repos and user location.
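To make the shorthand concrete, here is a minimal Python sketch that assembles a query string from a few of these filters. The @username and repo:name forms come from the description above; the path: and extension: spellings, the build_query helper and the example values are assumptions for illustration, so check the advanced search form for the exact syntax GitHub accepts.

```python
def build_query(terms, user=None, repo=None, path=None, extension=None):
    """Assemble a search string from optional filters (illustrative only)."""
    parts = [terms]
    if user:
        # @username limits results to one user's repositories.
        parts.append("@" + user)
    if repo:
        # repo:name limits results to a single repository.
        parts.append("repo:" + repo)
    if path:
        # Assumed spelling for the path filter.
        parts.append("path:" + path)
    if extension:
        # Assumed spelling for the file extension filter.
        parts.append("extension:" + extension)
    return " ".join(parts)


# Hypothetical user name and search terms.
print(build_query("ssh config", user="octocat", extension="rb"))
# prints: ssh config @octocat extension:rb
```

The resulting string is the sort of shorthand you would type into the search box yourself; the sketch only shows how the individual filters combine.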

While the advanced operators offer a quick way to search, there’s no need to memorize them all. The new advanced search form allows you to craft your query using multiple fields, while it displays the shorthand version at the top of the page so you learn as you go.

Under the hood, GitHub’s new search is powered by an ElasticSearch cluster that live-indexes your code as you push it to GitHub. The results you see will include any public repositories, as well as any private repositories that you have access to.

The GitHub blog also notes that, “to ensure better relevancy, we’re being conservative in what we add to the search index.” That means, for example, that forks will not be in search results (unless the fork has more stars than the parent repository). While that may mean you occasionally miss a bit of code, it goes a long way toward reducing a problem that plagues many other code search engines: the overwhelming number of duplicate results.
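The fork rule is easy to picture as a small filter. The Python sketch below is only an illustration of the rule as described above, not GitHub’s actual indexing code; the Repo class, its fields and the star counts are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repo:
    # Hypothetical record; real repository metadata is far richer.
    name: str
    stars: int
    parent: Optional["Repo"] = None  # None means the repo is not a fork


def is_indexed(repo):
    """Apply the rule described above: skip forks unless they out-star the parent."""
    if repo.parent is None:
        return True
    return repo.stars > repo.parent.stars


original = Repo("alice/widget", stars=120)
quiet_fork = Repo("bob/widget", stars=15, parent=original)
popular_fork = Repo("carol/widget", stars=300, parent=original)

for repo in (original, quiet_fork, popular_fork):
    print(repo.name, is_indexed(repo))
# prints:
# alice/widget True
# bob/widget False
# carol/widget True
```

A fork with exactly as many stars as its parent is left out here, since the rule as quoted only admits forks with more stars.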

GitHub’s more powerful search has turned up one unintended consequence: exposed data. It’s now much easier to search for anything on the site, including, say, usernames and passwords. As it turns out, many people seem to have everything from SSH keys to Gmail passwords stored in public GitHub repos. There’s a discussion about the issue over on Hacker News. The ability to find things like exposed passwords isn’t new, but the new search tool makes it easier than ever. Let this be a reminder of something that’s hopefully obvious to Webmonkey readers: never store passwords or private keys on a public site. And if you find someone doing that, do the right thing and let them know.

For more details on everything that’s new in GitHub’s search page, head on over to the GitHub blog.