In my day job, I have to write reams of complicated code that can slow down the ...

derefr · on May 20, 2015

> Nobody worth having as a customer is going to build their business on your platform if your attitude whenever something goes wrong is "you shouldn't have been doing that in the first place".

My favorite set of APIs is AWS. You know why? Because they've realized they hold two very weighty sticks that they can use when designing, and they've put them in place all over.

1. They can make any arbitrary message to the API cost the user money every time they send it, to disincentivize using that part of the API thoughtlessly. That's whether or not they expect this to be an actual revenue stream at the rates people are charged for reasonable usage.

2. They can put a "soft cap" on any arbitrary resource, so that you have to phone them and get the cap raised if you want more than [some reasonable number] of something. This likewise disincentivizes bad designs that use a nigh-infinite number of costly somethings to accomplish tasks that could be just as easily accomplished some more idiomatic, less costly way.

AWS doesn't prevent you from doing stupid things... but it makes you really not want to. I love it.

click170 · on May 21, 2015

Have you ever tried to setup AWS IAM permissions for a user pursuant to the principle of least privilege? Because Amazon's APIs are about as far from friendly as you can get in this respect.

Their docs make it easy to make the mistake of thinking that fine-grained controls are available for most things, but when it comes to really important things like being able to segregate a production and Dev VPC, their APIs basically force you to grant permissions to everything or nothing.

Some examples of things I've hit: Not being able to restrict a user to only change a specific routing table Not being able to restrict a user to only change a specific elastic NIC

I'm consistently surprised at what's missing from their API and couldn't disagree more about being happy with it.

derefr · on May 21, 2015

These things are possible... but this gets at another aspect of AWS's design in particular.

I do a lot of my AWS work in CloudFormation. When I hit a wall, the answer is pretty much always to stand up an EC2 instance that can speak SNS, grant it larger-than-necessary permissions to my VPC, teach CloudFormation about it as a custom resource type, and have it serve as a proxy for the not-configurable-enough resource, allowing it to assert its own policy and make third-party calls before making the real callback into your VPC [or not.] It's the AWS equivalent of writing a factory method to wrap a badly-written constructor.

To generalize that thought: IAM "users" are made to either be people (e.g. your developers, your ops people), or representative tokens for entire third-party organizations (e.g. a CI bot.) Despite the existence of IAM roles, IAM isn't really made to assert "machine-agent"-granular permissions.

Instead, what you really want is to imagine a third-party service running in the AWS cloud that does exactly what you want. You would grant that third-party's IAM user overly-wide permission to play with your VPC, but trust it to only do what it should, because, obviously, you have a business relationship and it would be dumb of them to abuse it.

As soon as you can see what API needs to exist, you can turn around and become that very same imaginary third-party: make a separate AWS account, stand up an API server in it that takes requests to do what your "clients" want, and then, in turn, make requests to the AWS APIs on their behalf to accomplish those things.

AWS isn't a high-level framework; it's a kit of low-level tools. (This is really what the PaaS vs IaaS distinction implies, I think.) AWS is built assuming that you're willing and able to take their tools and pipe/script them together to build the higher-level components you need. And, since AWS is for web services, that assumption comes in the form of expecting you to be able to pipe, hook, or wrap any of their APIs to/with/in your own API.

vacri · on May 21, 2015

The 'fine-grain' of IAM varies considerably depending on which AWS service you're restricting. You can add extra flexibility with 'Conditions', which I'm sure you're aware of, but I think it's a bit of a misrepresentation to paint IAM as being poor quality. AWS is a very complex environment; I can't see how you could have a user-friendly yet fine-grained user control for something that complex. Anything you choose is going to require training in how to use it.

I wouldn't say I'm happy about it, but neither am I unhappy, and neither am I happy about anything in the world of security (also in today's task list is updating https cipher lists... again...). Not even the simplest thing in security is easy. For example, the basic concept of a password is simple, but actually implementing it? Ugh - it involves every layer from backend to frontend to user training (the hardest part - no sticky notes, no friendly phone calls, no passing around in emails...).

Anyway, for those not used to IAM 'Conditions', an example of use. The following allows Packer (an AMI builder) to destroy any EC2 instance, but only if they have the tag 'name' as 'Packer Builder'. Conditions don't work for everything, so they're not a workaround to get fine-grain everywhere, but they do add a lot of flexibility.

            "Sid": "AllowInstanceActions",
            "Effect": "Allow",
            "Action": [
                "ec2:StopInstances",
                "ec2:TerminateInstances",
                "ec2:AttachVolume",
                "ec2:DetachVolume",
                "ec2:DeleteVolume"
            ],
            "Resource": [
                "arn:aws:ec2:us-east-1:xxxxxx:instance/*",
                "arn:aws:ec2:us-east-1:xxxxxx:volume/*",
                "arn:aws:ec2:us-east-1:xxxxxx:security-group/*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:ResourceTag/Name": "Packer Builder"
                }
            }

digi_owl · on May 21, 2015

No wonder everyone wants to be in the *aaS business. That sounds like the analog telco days, where long distance were billed harshly to get people to keep their calls short. Thus getting the most out of a low number of wires.

angersock · on May 20, 2015

There's a balance, though. Oftentimes, you serve the business folks better by preventing them from doing something really silly, and instead letting them optimize their workflow not to include silliness.

Unfortunately, it's tricky to tell ahead of time which features are bad ideas and which aren't.

tjradcliffe · on May 20, 2015

It's pretty easy at first order: features that users want are good ideas. I agree at second order users want a lot of features that solve their problems in ways that are less clever than one might like, but in the vast majority of cases the right answer from development is, "Users know what they need better than I do, so if they ask for something, I'll do my best to implement it even if I don't totally understand why."

That answer exposes the deeper answer, too: "My job as a developer is to understand user needs so that our software can help them fulfill them, so if I don't understand why someone needs a feature I should dig in further before implementing it so I don't implement the wrong thing or the right thing in the wrong way." Not always possible given shedule constraints, though.

Admitedly, those rules don't seem to apply in this case, since if the OS allows you to do something that corrupts data, that's a problem no matter why the user wants to do it. If it can corrupt data, the OS shouldn't allow you to do it, end of story.

nathanb · on May 20, 2015

It's not that simple (speaking as someone who works on OS SCSI driver code for a living...).

I will amend your statement to say that the OS itself should never corrupt your data, which I agree with entirely.

The post was not super clear about exactly what was happening (or it may be that my limited knowledge of Windows storage internals is keeping me from understanding it), but it sounds like the NTFS client requested a cache flush and then was issuing writes during the flush. I don't know what contract these operations have, but it may very well be the case that the user was violating the contract. If Microsoft responded with "don't do that", this may be the case.

But wait! Shouldn't Windows prevent the data from being corrupted? Or shouldn't NTFS fail the writes in this case? Possibly. And most likely inserting the checks to make this happen would increase write latency for every NTFS client, even the ones that don't behave in this way.

This reminds me of another scenario I encountered, with Veritas VxFS running on top of AIX. The user initiated a space reclamation, which was sending what you can think of as a delete to the storage array. And the user was also writing data to the device at the same time. Due to a race condition (which I can describe for you if you really care), the legitimate user data would sometimes be deleted.

Should VxFS have protected users against this case? Yes. Was VxFS violating the SCSI protocol? No. Was the storage array violating the SCSI protocol? No. (Has VxFS fixed this bug, almost three years after I discovered it? No comment.)

It's always a lot more complicated than it seems on the outside.

chetanahuja · on May 20, 2015

Sorry I don't buy that. An operating system's "contract" with the user is the syscall API. There's no room for argument there.

Calling a write while flushing the same data from another process (as the OP reported) or thread is perfectly legitimate set of operations. If these operations are not supposed to be running simultaneously, that has to be enforced by the OS kernel. The "right" way would be for the OS to serialize these operations internally. But even returning an error for the write might be an acceptable (though not nice) way to handle it. What's absolutely not acceptable is randomly corrupting data.

nathanb · on May 21, 2015

I don't know anything about Windows driver programming so I don't know what the contract of the buffer flush operation is. Obviously, "sometimes fills your buffer with random data from memory pages owned by the operating system" is not part of that contract. This is a bug. I'm not trying to argue that it isn't.

All I'm saying is that no real, sane OS is going to be capable of protecting itself against every possible misuse. Remember that these are developers coding to an API we're talking about, not end users.

What if you have an API that takes an out pointer. I pass in data that I own, then I free that pointer in another thread. If I'm in userspace, I can blow up with SIGSEGV. If I'm in the kernel, maybe now you've just scribbled all over somebody else's memory. Shame on you for corrupting data.

APIs can always be abused. All the API developer can do is try to protect against obvious forms of abuse. I guarantee you every operating system that supports simultaneous multiprocess execution has some series of APIs that, when called in parallel, will corrupt your data.

Dylan16807 · on May 21, 2015

>All I'm saying is that no real, sane OS is going to be capable of protecting itself against every possible misuse. Remember that these are developers coding to an API we're talking about, not end users.

Only for a wide interpretation of 'misuse'. In the case you described, the kernel is still doing what it was asked. In the article, it wasn't. Situations where one command invalidates another should be very uncommon and very documented.

A real, sane OS should be resilient to any syscalls in any order without triggering internal bugs.

nathanb · on May 21, 2015

This may come as a shock to you, but operating systems have bugs. And nobody is trying to defend NTFS populating your file with random OS pages.

So yes, an OS should allow you to call its API without triggering internal bugs. But that's kind of a straw man.

There's an interesting line that you're exploring, though, and I would like to dig deeper. You said "the kernel is still doing what it was asked". Let's ignore the bug described by the article for a moment and look into this.

Let's say I have two threads. Each one does a write to the same LBA range. You wind up with a file that doesn't represent the full contents of either write (say it's a 8192-byte write, and your file has 4096 bytes from the first write and 4096 bytes from the second write). Do you consider that to be a bug in the OS or a bug in the application client?

(It's obvious that two conflicting writes will result in someone losing their data. The part of the scenario I'm exploring is that in this case everyone lost their data.)

Dylan16807 · on May 21, 2015

It's not a straw man, it's a description of the rule that was actually violated by windows.

In the absence of documentation saying otherwise, I would expect the calls to be serialized, and one to happen entirely before the other.

Now if they each did two writes at about the same time, it's no longer the OS's problem if the threads interlace badly.

chetanahuja · on May 21, 2015

"So yes, an OS should allow you to call its API without triggering internal bugs. But that's kind of a straw man."

That's not a straw man. That's the exact topic under discussion in this thread.

"Let's ignore the bug described by the article for a moment and look into this."

And that my friend is the definition of a straw man :-)

nathanb · on May 22, 2015

Wait, I think you and I are having very different conversations.

If the topic you're discussing is that the clients of an OS API should not encounter a bug...uh, I mean, I don't think you'll find anyone to disagree with you. It's not like Microsoft engineers put that bug in there thinking it would be OK. They even acknowledged that it was a bug...it just wasn't high priority because the caller (presumably; again, I know nothing about these syscalls) is not following best practices. Someone's pet bug got deprioritized. This is not news.

The topic I thought we were discussing was whether it's the OS's responsibility to prevent callers from misusing their APIs in a way that causes stupid things to happen, even when the caller is doing something stupid.

I think the latter is a more interesting conversation, but I apologize if I interrupted your discussion of the former.

chetanahuja · on May 23, 2015

"Wait, I think you and I are having very different conversations."

So it would seem.

"It's not like Microsoft engineers put that bug in there thinking it would be OK."

You say that now. But the reason my strident response was triggered was because of what you said in your original post that I replied to:

"I don't know what contract these operations have, but it may very well be the case that the user was violating the contract. If Microsoft responded with "don't do that", this may be the case."

That is quite absurd on it's face. I replied with how the contract was nothing more or less than the syscall API and there was no margin for negotiation on kernel's part when it comes to corrupting data when a user program calls those APIs in whatever order it pleases.

Subsequently, your arguments seem to have become more elaborate with a lot more caveats added. I think it has stopped being fruitful for me to respond at this point.

fl0wenol · on May 21, 2015

The bug wasn't just corrupting the data in the file after the weird sequence of API calls.

It was corrupting it with data from other files on the disk, which could have been sensitive.

Where this was particularly troubling was that an unprivileged user could then use this as an exploit/attack on the system to get it to leak pages of system files. This is where fun stuff like pass-the-hash begins.

nathanb · on May 21, 2015

My reading of the bug is that it would write random data from memory owned by the OS, not necessarily other files on disk. Certainly no better (and could very well be worse), but just a clarification.

(Going back and reading the post again, it seems that it's even worse than that, since a virtual environment hitting this bug could get data from the host. This opens an attack vector for a guest to bypass the hypervisor and compromise sensitive host data. I don't know if data from other VMs on the system could also be exposed in this way, but that would also be quite bad.)

Anyway, I think my rambling caused my point to get lost, because I wasn't trying to argue that Microsoft are justified in their "don't do that" comment. But no sane, realistic OS can prevent against every hare-brained thing a driver developer is going to do.