The change means that you can easily get to a list of all your Gists by heading to https://gist.github.com/<username>/.
Gists, which started off as a simple way to dump and share snippets and short pieces of reusable code (something akin to the older Pastebin), were recently upgraded to be full-fledged Git repos behind the scenes. That means Gists are automatically versioned, forkable and usable as Git repos, complete with diffs.
Now that Gists are considerably more than just Pastebin-style code snippets, it makes sense to offer users a quick and easy way to get to their Gists from anywhere thanks to a memorable URL.
The newly personalized Gists come with an automatic URL redirect. So if your Gist used to live at https://gist.github.com/4731290 it will now be redirected to https://gist.github.com/luxagraf/4731290. As some GitHub users point out on Hacker News, there’s a flaw in GitHub’s system that means anyone can register a numeric username and cause a Gist to redirect to the wrong page. Hopefully GitHub will fix that in the near future.
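To make the redirect and the numeric-username problem concrete, here is a minimal sketch in Python. It is not GitHub's actual routing code; the helper names and the assumption that legacy Gist IDs are purely numeric (as in the example URL above) are ours.

```python
import re
from typing import Optional

# Assumption: a legacy Gist URL is the host followed by a numeric ID only.
LEGACY_GIST = re.compile(r"^https://gist\.github\.com/(?P<gist_id>\d+)/?$")


def personalized_gist_url(legacy_url: str, username: str) -> Optional[str]:
    """Return the username-prefixed URL for a legacy Gist URL, or None."""
    match = LEGACY_GIST.match(legacy_url)
    if match is None:
        return None  # not a legacy-style URL, nothing to redirect
    return f"https://gist.github.com/{username}/{match.group('gist_id')}"


def is_ambiguous(path_segment: str) -> bool:
    """True when one path segment could be read as a Gist ID or a username.

    An all-digit username looks exactly like a legacy Gist ID, which is
    the collision described on Hacker News.
    """
    return path_segment.isascii() and path_segment.isdigit()
```

Calling `personalized_gist_url("https://gist.github.com/4731290", "luxagraf")` yields the redirected address from the example above, while `is_ambiguous("4731290")` flags the kind of name that could be mistaken for a Gist ID.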
Open source is about building on the work of others and not having to reinvent the wheel. But if you can’t find the code you need then you’re stuck reinventing the wheel. Again.
To help you find exactly the wheels your project needs, code hosting giant GitHub has announced a new, much more powerful search tool that peers inside GitHub repositories and offers dozens of filters to help you discover the code you need.
The new search further cements GitHub’s place as the go-to source not just for publishing, but also discovering, code on the web.
While GitHub’s new search lacks the web-wide reach of more general code search engines like Google’s once-mighty Code Search (now a hollow shell of its former self), it’s likely to return more useful results thanks to some nice extras like the ability to see recent activity and narrow results by the number of users, stars and forks.
GitHub’s advanced search page now supports operators like @username to limit results to just your repositories (or another user’s repos), code from only one repository (repo:name) or even code from a particular path within a repo. You can also limit by file extension, repo size, number of forks, number of stars, number of followers, number of repos and user location.
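If you would rather script your searches, the shorthand is easy to assemble by hand. The following Python sketch is only an illustration: the `build_search_query` and `search_url` helpers are hypothetical, the repository name is made up, and the `name:value` qualifier spellings follow the `repo:name` pattern described above, so check GitHub's own search documentation for the exact operators it currently accepts.

```python
from urllib.parse import urlencode


def build_search_query(terms: str, **qualifiers: str) -> str:
    """Join free-text terms with name:value qualifiers, e.g. repo:owner/name."""
    parts = [terms] if terms else []
    for name, value in qualifiers.items():
        if value is not None:
            parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Build a search URL; assumes the q= parameter on github.com/search."""
    return "https://github.com/search?" + urlencode({"q": query})


# Hypothetical example: look for "parser" in Python files of one repository.
query = build_search_query("parser", repo="luxagraf/example", extension="py")
print(query)
print(search_url(query))
```

Keeping the query builder separate from the URL builder means the same string can be pasted straight into the search box.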
While the advanced operators offer a quick way to search, there’s no need to memorize them all. The new advanced search form allows you to craft your query using multiple fields, while it displays the shorthand version at the top of the page so you learn as you go.
Under the hood GitHub’s new search is powered by an ElasticSearch cluster which live-indexes your code as you push it to GitHub. The results you see will include any public repositories, as well as any private repositories that you have access to.
The GitHub blog also notes that, “to ensure better relevancy, we’re being conservative in what we add to the search index.” That means, for example, that forks will not be in search results (unless the fork has more stars than the parent repository). While that may mean you occasionally miss a bit of code, it goes a long way toward reducing a problem that plagues many other code search engines — the overwhelming amount of duplicate results.
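The fork rule is simple enough to express in a few lines. This Python sketch is our own reading of the rule as quoted above, not GitHub's indexing code; the `Repo` class and `should_index` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repo:
    name: str
    stars: int
    parent: Optional["Repo"] = None  # None means the repo is not a fork


def should_index(repo: Repo) -> bool:
    """Index originals always; index a fork only if it out-stars its parent."""
    if repo.parent is None:
        return True
    return repo.stars > repo.parent.stars
```

Under this reading, a fork with the same number of stars as its parent stays out of the index, which is what keeps near-identical copies from crowding the results.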
GitHub’s more powerful search has turned up one unintended consequence — exposed data. It’s much easier to search for anything on the site, including, say, usernames and passwords. As it turns out many people seem to have everything from SSH keys to Gmail passwords stored in public GitHub repos. There’s a discussion about the issue over on Hacker News. The ability to find things like exposed passwords isn’t new, but the new search tool does make it easier than ever. Let this be a reminder of something that’s hopefully obvious to Webmonkey readers — never store passwords or private keys on a public site. And if you find someone doing that, do the right thing and let them know.
For more details on everything that’s new in GitHub’s search page, head on over to the GitHub blog.
Gists are a way to dump and share snippets and short pieces of reusable code — too short to bother creating a full-fledged Git repository, but something you’d like to save and share nonetheless — covering roughly the same use case as something like the much older Pastebin. Or at least that used to be the case.
The new gists are considerably more powerful. The rewrite actually turns gists into full Git repositories, so they are automatically versioned, forkable and usable as Git repos, complete with diffs.
Gists are also now searchable — complete with the ability to filter searches by language — and there’s a new Discover page as well.
Like normal GitHub repos, gists now offer the Ace code editor with its syntax highlighting and automatic indenting. While the Ace editor is nice, my favorite way to create gists is through editor plugins like this one for Vim, this one for Emacs or this one for Sublime Text 2.
The “horrible thing” in developer Erik Rose’s talk from this year’s PyCon is the Mediawiki syntax, but that’s just a jumping off point for one of the best overviews of data parsing that I’ve run across. If you’ve got a project that involves parsing, or are, like me, considering one, this talk is a must-watch.
This is PyCon, so much of the talk focuses on parsing in Python, but there’s plenty of broader, dare I say, “parsing philosophy” that makes it well worth a watch even if you don’t end up using the specific Python parsing libraries Rose mentions.
The Voynich Manuscript was very poorly commented. Image: Wikimedia
We’ve written before about the value of writing your README before your code, but what about when it comes to the actual code? Terse one-liners? Paragraph-long descriptions? How much is enough and when is it too much?
How to comment code is a perennial subject of debate for programmers, one that developer Zachary Voase recently jumped into, arguing that one of the potential flaws with extensive comments (or any comments really) is that they never seem to get updated when the code changes. “We forget,” writes Voase, “overlooking a comment when changing the fundamental behavior or semantics of the code to which it relates.”
Voase thinks the solution is in our text editors, which typically “gray out” comments, fading them into the background so we can focus on the actual code. We ought to do the opposite, he believes: Make the comments jump out. Looking at the visual examples on Voase’s post makes the argument a bit more compelling. Good text editors have configurable color schemes so it shouldn’t be too hard to give this a try and see if it improves your comments and your code.
Another approach is to treat comments as a narrative. Dave Winer recently mentioned comments in passing, writing about the benefits of using an outliner to handle comments since it makes it easy to show and hide them:
Another thing that works is the idea of code as a weblog. At the top of each part there’s a section where each change is explained. The important thing is that with elision (expand/collapse) comments don’t take up visual space so there’s no penalty for fully explaining the work. Without this ability there’s an impossible trade off between comments and the clarity of comment-free code. No manager wants to penalize developers for commenting their work. With this change, with outlining, that now works.
Donald Knuth, author of the seminal book, The Art of Computer Programming, advocated a similar narrative approach with what he called Literate Programming. Literate programming seeks to weave comments and docs out of a “literate” source.
Then there’s the opposite school of thought that says your code should always be so clear and so obvious as to never need comments. See Slashdot for quite a few people advocating this approach, most of whom we suspect have never had to go back and read through their code again years after it was written.
The best way to comment your code is up to you, but whichever path your team decides to follow, the best advice is to take the time to actually have a plan for comments. The most useless comments are haphazardly written, which also makes them unlikely to be updated when the code changes. There are as many approaches as there are programmers; just make sure you settle on one and stick with it. Down the road, when it’s time to update that older code, you’ll thank yourself.