
Snowden's Dead Man's Switch


Edward Snowden has set up a dead man's switch. He's distributed encrypted copies of his document trove to various people, and has set up some sort of automatic system to distribute the key, should something happen to him.

Dead man's switches have a long history, both for safety (the machinery automatically stops if the operator's hand goes slack) and security reasons. WikiLeaks did the same thing with the State Department cables.

"It's not just a matter of, if he dies, things get released, it's more nuanced than that," he said. "It's really just a way to protect himself against extremely rogue behavior on the part of the United States, by which I mean violent actions toward him, designed to end his life, and it's just a way to ensure that nobody feels incentivized to do that."

I'm not sure he's thought this through, though. I would be more worried that someone would kill me in order to get the documents released than I would be that someone would kill me to prevent the documents from being released. Any real-world situation involves multiple adversaries, and it's important to keep all of them in mind when designing a security system.

1 public comment
WorldMaker
3307 days ago
reply
I think Bruce just wrote my awesome thriller called "Dead Man's Switch" for me...
Louisville, Kentucky

Victory Lap for Ask Patents


There are a lot of people complaining about lousy software patents these days. I say, stop complaining, and start killing them. It took me about fifteen minutes to stop a crappy Microsoft patent from being approved. Got fifteen minutes? You can do it too.

In a minute, I’ll tell you that story. But first, a little background.

Software developers don’t actually invent very much. The number of actually novel, non-obvious inventions in the software industry that maybe, in some universe, deserve a government-granted monopoly is, perhaps, two.

The other 40,000-odd software patents issued every year are mostly garbage that any working programmer could “invent” three times before breakfast. Most issued software patents aren’t “inventions” as most people understand that word. They’re just things that any first-year student learning Java should be able to do as a homework assignment in two hours.

Nevertheless, a lot of companies large and small have figured out that patents are worth money, so they try to file as many as they possibly can. They figure they can generate a big pile of patents as an inexpensive byproduct of the R&D work they’re doing anyway, just by sending some lawyers around the halls to ask programmers what they’re working on, and then attempting to patent everything. Almost everything they find is either obvious or has been done before, so it shouldn’t be patentable, but they use some sneaky tricks to get these things through the patent office.

The first technique is to try to make the language of the patent as confusing and obfuscated as possible. That actually makes it harder for a patent examiner to identify prior art or evaluate if the invention is obvious.

A bonus side effect of writing an incomprehensible patent is that it works better as an infringement trap. Many patent owners, especially the troll types, don’t really want you to avoid their patent. Often they actually want you to infringe their patent, and then build a big business that relies on that infringement, and only then do they want you to find out about the patent, so you are in the worst possible legal position and can be extorted successfully. The harder the patent is to read, the more likely it will be inadvertently infringed.

The second technique to getting bad software patents issued is to use a thesaurus. Often, software patent applicants make up new terms to describe things with perfectly good, existing names. A lot of examiners will search for prior art using, well, search tools. They have to; no single patent examiner can possibly be aware of more than (rounding to nearest whole number) 0% of the prior art which might have invalidated the application.

Since patent examiners rely so much on keyword searches, when you submit your application, if you can change some of the keywords in your patent to be different than the words used everywhere else, you might get your patent through even when there’s blatant prior art, because by using weird, made-up words for things, you’ve made that prior art harder to find. 
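
The thesaurus trick is easy to demonstrate. Here's a toy sketch (the document names and their phrasing are invented, purely for illustration) of why an examiner's keyword search comes up empty when the applicant renames a well-known concept:

```python
# Hypothetical corpus of prior-art documents (names and text invented).
prior_art = {
    "win32-dpi-guide": "scale the image to match the display resolution",
    "icon-formats":    "ship icons at several resolutions and pick the closest",
}

def keyword_search(corpus, term):
    """Return the ids of documents whose text contains the search term."""
    return [doc_id for doc_id, text in corpus.items() if term in text]

# Searching on the applicant's made-up vocabulary finds nothing, even though
# the same idea is described everywhere under its common name.
print(keyword_search(prior_art, "pixel density"))  # -> []
print(keyword_search(prior_art, "resolution"))     # -> both documents
```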

Now on to the third technique. Have you ever seen a patent application that appears ridiculously broad? (“Good lord, they’re trying to patent CARS!”). Here’s why. The applicant is deliberately overreaching, that is, striving to get the broadest possible patent knowing that the worst thing that can happen is that the patent examiner whittles their claims down to what they were entitled to patent anyway.

Let me illustrate that as simply as I can. At the heart of a patent is a list of claims: the things you allege to have invented that you will get a monopoly on if your patent is accepted.

An example might help. Imagine a simple application with these three claims:

1. A method of transportation
2. The method of transportation in claim 1, wherein there is an engine connected to wheels
3. The method of transportation in claim 2, wherein the engine runs on water

Notice that claim 2 mentions claim 1, and narrows it... in other words, it claims a strict subset of things from claim 1.

Now, suppose you invented the water-powered car. When you submit your patent, you might submit it this way even knowing that there’s prior art for “methods of transportation” and you can’t really claim all of them as your invention. The theory is that (a) hey, you might get lucky! and (b) even if you don’t get lucky and the first claim is rejected, the narrower claims will still stand.
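
The nesting of claims can be modeled as predicates over products, each one covering a strict subset of the one above it (the predicates and example products here are mine, purely for illustration):

```python
# Each claim is a predicate; claim N+1 narrows claim N, so the set of
# products it covers is a strict subset of the claim above it.
def claim1(product):                      # any method of transportation
    return product.get("transports", False)

def claim2(product):                      # ...with an engine driving wheels
    return claim1(product) and product.get("engine_on_wheels", False)

def claim3(product):                      # ...where the engine runs on water
    return claim2(product) and product.get("water_powered", False)

bicycle   = {"transports": True}
gas_car   = {"transports": True, "engine_on_wheels": True}
water_car = {"transports": True, "engine_on_wheels": True, "water_powered": True}

# Even if claim 1 is rejected as overbroad, claims 2 and 3 stand on their own:
print([claim1(p) for p in (bicycle, gas_car, water_car)])  # [True, True, True]
print([claim3(p) for p in (bicycle, gas_car, water_car)])  # [False, False, True]
```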

What you’re seeing is just a long shot lottery ticket, and you have to look deep into the narrower claims to see what they really expect to get. And you never know, the patent office might be asleep at the wheel and BOOM you get to extort everyone who makes, sells, buys, or rides transportation.

So anyway, a lot of crappy software patents get issued and the more that get issued, the worse it is for software developers.

The patent office got a little bit of heat about this. The America Invents Act changed the law to allow the public to submit examples of prior art while a patent application is being examined. And that’s why the USPTO asked us to set up Ask Patents, a Stack Exchange site where software developers like you can submit examples of prior art to stop crappy software patents even before they’re issued.

Sounds hard, right?

At first I honestly thought it was going to be hard. Would we even be able to find vulnerable applications? The funny thing is that when I looked at a bunch of software patent applications at random I came to realize that they were all bad, which makes our job much easier.

Take patent application US 20130063492 A1, submitted by Microsoft. An Ask Patents user submitted this call for prior art on March 26th.

I tried to find prior art for this just to see how hard it was. First I read the application. Well, to be honest, I kind of glanced at the application. In fact I skipped the abstract and the description and went straight to the claims. Dan Shapiro has a great blog post called How to Read a Patent in 60 Seconds which taught me how to do this.

This patent was, typically, obfuscated, and it used terms like “pixel density” for something that every other programmer in the world would call “resolution,” either accidentally (because Microsoft’s lawyers were not programmers), or, more likely, because the obfuscation makes it that much harder to search.

Without reading too deeply, I realized that this patent is basically trying to say “Sometimes you have a picture that you want to scale to different resolutions. When this happens, you might want to have multiple versions of the image available at different resolutions, so you can pick the one that’s closest and scale that.”
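
Stripped of the patent-ese, the technique amounts to a few lines. Here's a minimal sketch (the function and variable names are mine, not from the patent or the Win32 API):

```python
def pick_closest(available_widths, target_width):
    """Pick the pre-rendered image size nearest the display's target size,
    so any remaining scaling is as small as possible."""
    return min(available_widths, key=lambda w: abs(w - target_width))

# Icons shipped at several resolutions, as Windows has long supported:
sizes = [16, 32, 48, 256]
print(pick_closest(sizes, 44))   # -> 48: scale that one down slightly
print(pick_closest(sizes, 20))   # -> 16: the nearest available version
```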

This didn’t seem novel to me. I was pretty sure that the Win32 API already had a feature to do something like that. I remembered that it was common to provide multiple icons at different resolutions and in fact I was pretty sure that the operating system could pick one based on the resolution of the display. So I spent about a minute with Google and eventually (bing!) found this interesting document entitled Writing DPI-Aware Win32 Applications [PDF] written by Ryan Haveson and Ken Sykes at, what a coincidence, Microsoft.

And it was written in 2008, while Microsoft’s new patent application was trying to claim that this “invention” was “invented” in 2011. Boom. Prior art found, and deployed.

Total time elapsed, maybe 10 minutes. One of the participants on Ask Patents pointed out that the patent application referred to something called “scaling sets.” I wasn’t sure what that was supposed to mean but I found a specific part of the older Microsoft document that demonstrated this “invention” without using the same word, so I edited my answer a bit to point it out. Here’s my complete answer on AskPatents.

Mysteriously, whoever it was that posted the request for prior art checked the Accepted button on Stack Exchange. We thought this might be the patent examiner, but it was posted with a generic username.

At that point I promptly forgot about it, until May 21 (two months later), when I got this email from Micah Siegel (Micah is our full-time patent expert):

The USPTO rejected Microsoft's Resizing Imaging Patent!

The examiner referred specifically to Prior Art cited in Joel's answer ("Haveson et al").

Here is the actual document rejecting the patent. It is a clean sweep, starting on page 4 and continuing throughout, basically rejecting the application as obvious in view of Haveson.

Micah showed me a document from the USPTO confirming that they had rejected the patent application, and the rejection relied very heavily on the document I found. This was, in fact, the first “confirmed kill” of Ask Patents, and it was really surprisingly easy. I didn’t have to do the hard work of studying everything in the patent application and carefully proving that it was all prior art: the examiner did that for me. (It’s a pleasure to read him demolish the patent in question, all twenty claims, if that kind of schadenfreude amuses you).

(If you want to see the rejection, go to Public PAIR and search for publication number US 20130063492 A1. Click on Image File Wrapper, and look at the non-final rejection of 4-11-2013. Microsoft is, needless to say, appealing the decision, so this crappy patent may re-surface.)

There is, though, an interesting lesson here. Software patent applications are of uniformly poor quality. They are remarkably easy to find prior art for. Ask Patents can be used to block them with very little work. And this kind of individual destruction of one software patent application at a time might start to make a dent in the mountain of bad patents getting granted.

My dream is that when big companies hear about how friggin’ easy it is to block a patent application, they’ll use Ask Patents to start messing with their competitors. How cool would it be if Apple, Samsung, Oracle and Google got into a Mexican Standoff on Ask Patents? If each of those companies had three or four engineers dedicating a few hours every day to picking off their competitors’ applications, the number of granted patents to those companies would grind to a halt. Wouldn’t that be something!

Got 15 minutes? Go to Ask Patents right now, and see if one of these RFPAs covers a topic you know something about, and post any examples you can find. They’re hidden in plain view; most of the prior art you need for software patents can be found on Google. Happy hunting!

Need to hire a really great programmer? Want a job that doesn't drive you crazy? Visit the Joel on Software Job Board: Great software jobs, great people.

12 public comments
glindsey1979
3303 days ago
reply
Crowdsource the destruction of software patents. I LOVE THIS.
Aurora, IL
Courtney
3304 days ago
reply
Kill Patent Applications in Your Spare Time, I love it.
Portland, OR
satadru
3304 days ago
reply
Programmers, make this part of your 1% time.
New York, NY
Romanikque
3304 days ago
reply
I wonder if this is going to become a reddit style trolling of industry writ large?
Baltimore, MD
matthewglidden
3304 days ago
reply
The pressure to file patents at my previous job was (and I'm sure remains) high, reinforced at every quarterly meeting. Ugh.
Cambridge, MA
norb
3304 days ago
reply
Crowdsourcing patent applications seems like a great idea.
clmbs.oh
francisga
3304 days ago
reply
Kill all the software patents!!!!
Lafayette, LA, USA
gazuga
3304 days ago
reply
Makes me think there should be a Stack Exchange for every institution that acts on complex questions of fact.
Edmonton
oliverzip
3304 days ago
reply
Want to zero a patent and got a free ten minutes?
Sydney, Balmain, Hornsby.
lkraav
3305 days ago
reply
Still, a great initiative.
Tallinn, Estonia
bluegecko
3305 days ago
reply
One patent down. Several thousand to go.
New York, NY

Ask HS: What's Wrong with Twitter, Why Isn't One Machine Enough?


Can anyone convincingly explain why properties sporting traffic statistics that may seem in line with the capabilities of a single big-iron machine need so many machines in their architecture?

This is a common reaction to architecture profiles on High Scalability: I could do all that on a few machines so they must be doing something really stupid. 

Lo and behold, this same reaction greeted the article The Architecture Twitter Uses to Deal with 150M Active Users. On Hacker News, papsosouid voiced what a lot of people may have been thinking:

I really question the current trend of creating big, complex, fragile architectures to "be able to scale". These numbers are a great example of why, the entire thing could run on a single server, in a very straight forward setup. When you are creating a cluster for scalability, and it has less CPU, RAM and IO than a single server, what are you gaining? They are only doing 6k writes a second for crying out loud.

This is a surprisingly hard reaction to counter convincingly, but nostrademons has a great three-part response:

They create big, complex, fragile architectures because they started with simple, off-the-shelf architectures that completely fell over at scale.

 

I dunno how long you've been on HN, but around 2007-2008 there were a bunch of HighScalability articles about Twitter's architecture. Back then it was a pretty standard Rails app: when a Tweet came in, it would do an insert into a (replicated) MySQL database, then at read time it would look up your followers (which I think was cached in memcached) and issue a SELECT for each of their recent tweets (possibly also with some caching). Twitter was down about half the time with the Fail Whale, and there was continuous armchair architecting about "Why can't they just do this simple solution and fix it?" The simple solution most often proposed was write-time fanout, basically what this article describes.

Do the math on what a single-server Twitter would require. 150M active users * 800 tweets saved/user * 300 bytes for a tweet = 36T of tweet data. Then you have 300K QPS for timelines, and let's estimate the average user follows 100 people. Say that you represent a user as a pointer to their tweet queue. So when a pageview comes in, you do 100 random-access reads. It's 100 ns per read, you're doing 300K * 100 = 30M reads, and so already you're falling behind by a factor of 3:1. And that's without any computation spent on business logic, generating HTML, sending SMSes, pushing to the firehose, archiving tweets, preventing DOSses, logging, mixing in sponsored tweets, or any of the other activities that Twitter does.

(BTW, estimation interview questions like "How many gas stations are in the U.S?" are routinely mocked on HN, but this comment is a great example why they're important. I just spent 15 minutes taking some numbers from an article and then making reasonable-but-generous estimates of numbers I don't know, to show that a proposed architectural solution won't work. That's opposed to maybe 15 man-months building it. That sort of problem shows up all the time in actual software engineering.)
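
nostrademons' arithmetic checks out, and it's worth seeing how mechanical it is. This snippet redoes the estimate using only the round figures the comment itself supplies:

```python
users         = 150_000_000   # active users
tweets_cached = 800           # recent tweets kept per user
tweet_bytes   = 300           # bytes per tweet

storage = users * tweets_cached * tweet_bytes
print(storage / 1e12)         # -> 36.0 terabytes of tweet data

timeline_qps  = 300_000       # timeline reads per second
follows       = 100           # average accounts followed per user
read_ns       = 100           # cost of one random-access memory read, in ns

reads_per_sec  = timeline_qps * follows      # 30 million reads per second
seconds_needed = reads_per_sec * read_ns / 1e9
print(seconds_needed)         # -> 3.0 seconds of work per wall-clock second
```

Three seconds of pure memory reads demanded of every one second of wall-clock time is the 3:1 deficit the comment describes, before any business logic runs at all.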

And the thread goes on with a lot of enlightening details. (Just as an aside, in an interview the question "How many gas stations are in the US" is worse than useless. If someone asked for a Twitter back-of-the-napkin analysis like nostrademons produced, now we are getting somewhere.)

Do you have an answer? Are these kinds of architectures evidence of incompetence, or is there a method to the madness?


How Twitter Keeps Its Billions Of Messages Flying Through The Air


Twitter looks like a simple thing, doesn't it? A simple network of people who follow each other, short snippet messages of 140 characters or less, no tricky privacy controls, VIP friends, or any of the shenanigans of Facebook. When it started, it absolutely was this simple from an infrastructure point of view. But Twitter's Raffi Krikorian, VP of Engineering, just gave a presentation that points out that keeping Twitter's fail whale from appearing for hundreds of millions of users around the world was far from the simple task of scaling up the early system. The mammoth job of simply making sure a tweet from a popular user navigates the infrastructure and gets out to the community on time may make you think twice about complaining about your own database problems.

Krikorian revealed some figures that show the scale of Twitter's problems: It has 150 million active users around the world, and the data going to and from these folks--that's 400 million tweets a day--squeezes through a 22 MB/second firehose. If Lady Gaga, with 31 million followers, sends a tweet, it can take up to about 5 minutes for those short 140 characters to reach all her fans. Because Twitter is much more a consumption platform than an input platform, the company has configured its entire infrastructure to support that bias: The code does a lot of processing the moment tweets arrive to figure out where they need to go, which means that when tweets are "read" through an API call, the response is much quicker than if the processing happened at read time.
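
That write-time processing is the classic fanout-on-write pattern: push each tweet onto every follower's timeline when it arrives, so reads are a cheap lookup. A toy model of the general technique (this is an illustration, not Twitter's actual code):

```python
from collections import defaultdict, deque

followers = defaultdict(set)                         # author -> follower ids
timelines = defaultdict(lambda: deque(maxlen=800))   # user -> recent tweets

def follow(user, author):
    followers[author].add(user)

def post(author, tweet):
    # Fan out at write time: posting costs O(followers) -- this is why a
    # tweet from an account with 31M followers takes minutes to propagate.
    for user in followers[author]:
        timelines[user].appendleft((author, tweet))

def read_timeline(user):
    # Read time is a simple lookup: no joins, no per-followee queries.
    return list(timelines[user])

follow("fan", "gaga")
post("gaga", "hello, little monsters")
print(read_timeline("fan"))   # -> [('gaga', 'hello, little monsters')]
```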

There are several other tricks Twitter uses, such as keeping track of active users and storing their data more accessibly than occasional users' info, and storing a bunch of data in RAM for speedy lookups. By a bunch, I mean a lot: Every active user's data is stored in RAM to lower latencies.

Propelling a lot of Twitter's thinking about its infrastructure is that it's now no longer a simple web app, or even a smartphone app: It's a coherent set of APIs for delivering messages accurately on a vast scale and in near-real-time to a diverse userbase. It's this API set that effectively is Twitter's core asset, and tied to advertising it's the key to more revenue in the future.

It's worth reading the précis of Krikorian's talk at HighScalability.com and the original content itself to learn more details. You may even glean some ideas for your next big data project.

[Image: By Flickr user Les Chatfield]






Douglas Engelbart (1925-2013)

Actual quote from The Demo: '... an advantage of being online is that it keeps track of who you are and what you’re doing all the time ...'
5 public comments
rorypatt
3318 days ago
reply
Birth(?) of online - 1968. RIP Douglas Engelbart.
reconbot
3321 days ago
reply
For the mobile users: Actual quote from The Demo: '... an advantage of being online is that it keeps track of who you are and what you’re doing all the time ...'
New York City
jdunning
3322 days ago
reply
YOLO!
taddevries
3322 days ago
reply
I often feel like I've been here before when surfing the web. I guess that is the fallout of 20 years online.
adamgurri
3322 days ago
reply
visionary
New York, NY

Organizational Skills Beat Algorithmic Wizardry

I've seen a number of blog entries about technical interviews at high-end companies that make me glad I'm not looking for work as a programmer. The ability to implement oddball variants of heaps and trees on the spot. Puzzles with difficult constraints. Numeric problems that would take ten billion years to complete unless you can cleverly analyze and rephrase the math. My first reaction is wow, how do they manage to hire anyone?

My second reaction is that the vast majority of programming doesn't involve this kind of algorithmic wizardry.

When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I've worked on large telecommunications systems, console games, blogging software, a bunch of personal tools, and very rarely is there some tricky data structure or algorithm that casts a looming shadow over everything else. But there's always lots of state to keep track of, rearranging of values, handling special cases, and carefully working out how all the pieces of a system interact. To a great extent the act of coding is one of organization. Refactoring. Simplifying. Figuring out how to remove extraneous manipulations here and there.

This is the reason there are so many accidental programmers. You don't see people casually become neurosurgeons in their spare time--the necessary training is specific and intense--but lots of people pick up enough coding skills to build things on their own. When I learned to program on an 8-bit home computer, I didn't even know what an algorithm was. I had no idea how to sort data, and fortunately for the little games I was designing I didn't need to. The code I wrote was all about timers and counters and state management. I was an organizer, not a genius.

I built a custom tool a few years ago that combines images into rectangular textures. It's not a big program--maybe 1500 lines of Erlang and C. There's one little twenty-line snippet that does the rectangle packing, and while it wasn't hard to write, I doubt I could have come up with it in an interview. The rest of the code is for loading files, generating output, dealing with image properties (such as origins), and handling the data flow between different parts of the program. This is also the code I tweak whenever I need a new feature, better error handling, or improved usability.
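
He doesn't show the packing snippet, but a naive "shelf" packer is about that size. Here's a sketch in Python of what such a core might look like (my reconstruction of the general technique, not his Erlang/C code):

```python
def shelf_pack(rects, atlas_width):
    """Place (w, h) rectangles left-to-right in rows ("shelves"), starting
    a new shelf when a rectangle won't fit. Returns an (x, y) origin for
    each rectangle, in input order."""
    x = y = shelf_height = 0
    positions = []
    for w, h in rects:
        if x + w > atlas_width:        # no room on this shelf: start a new one
            y += shelf_height
            x = shelf_height = 0
        positions.append((x, y))
        x += w
        shelf_height = max(shelf_height, h)
    return positions

print(shelf_pack([(64, 64), (64, 32), (32, 32)], atlas_width=128))
# -> [(0, 0), (64, 0), (0, 64)]
```

As the post says, this core is the easy part; the real work is the file loading, output generation, and data plumbing around it.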

That's representative of most software development.

(If you liked this, you might enjoy Hopefully More Controversial Programming Opinions.)
8 public comments
squinky
3334 days ago
reply
Programming tests. I hates them.
Santa Cruz, CA
LeMadChef
3336 days ago
reply
Please answer this really hard problem of the type you will never find in this job. Also, if you try any of your clever "algorithms" here you will be pulled aside and given a stern talking-to about readability and code maintenance. We don't want any cowboy coders here!
Denver, CO
rikishiama
3336 days ago
reply
God I wish my boss -- who's been trying to find a suitable (to him) programmer for over 3 months -- would read this. Actually, I wish he could *understand* this, let alone read it.
zwol
3337 days ago
reply
QFT: "...the number one most important [programming] skill is how to keep a tangle of features from collapsing under the weight of its own complexity."
Pittsburgh, PA
wffurr
3337 days ago
reply
Agreed, but how do you interview for organizational skills while also ensuring they know how to write a for loop or recursive function?
Cambridge, MA
wmorrell
3336 days ago
We ask the following: in language and environment of your choice, write a four-function calculator; i.e. a program which will take two numeric operands, an operator (addition, subtraction, multiplication, division), and produce a correct answer. You may use whatever tools you like, have full internet access, and can ask us for help. If the candidate succeeds in writing this in about an hour, we ask hir to add in a primality test, showing some indicator if the answer is a prime number. I've found it's a good test to show how a candidate handles vague requirements, basic I/O, simple state management and/or parsing, refactoring in the face of changing requirements. It's simple enough to accomplish in an interview, and covers the basic skills needed to create a "real" program.
acdha
3336 days ago
For non-junior positions I prefer to ask them to describe real problems they've worked on and how they addressed them. Trying to spring a full coding question is time-consuming and stressful but what I really want to hear is how well they understand the challenges and approaches – if they get things like loose coupling it's obvious in both what they dislike and how they propose dealing with it. These days Github is also invaluable for seeing code-as-practiced
jepler
3337 days ago
reply
at $DAY_JOB, a very tractable (O(n) even if you're naive) problem is the interview programming question. sadly, still a good weeder question.
Earth, Sol system, Western spiral arm