AWS Fargate Deep Dive (learnaws.org)
153 points by aray07 on Sept 15, 2019 | 91 comments



I love AWS, I really do, and I thought about using Fargate because the promise of not managing your "cattle-like" servers is wonderful, but they need to get the pricing within this stratosphere for it not to be a complete joke.

I actually really like ECS and I'm aware of how much time it would save me (a lot) and how much Terraform I could delete (a ton), and it's still not even close to worth it.

Amazon usually nails this sort of thing; it's surprising that despite the operational value it provides, nobody seems to be using it.


To be fair, Fargate today is a lot cheaper than Fargate at release. The work on Firecracker had a significant impact I believe.

Still, I'm with you. I use Fargate for my open source project but I fully expect teams using it to eventually move to EC2 and manage the cluster directly in order to save money. It is easily, by far, the most expensive part of my pipeline.


I believe vendor lock-in and lack of incremental value over EKS are the reasons. The amount of flexibility we lose by using Fargate is not compensated well enough by the ease of use offered.


I PoCed EKS for our company. Significant issues we experienced vs Fargate were that EKS had no sane logging story out of the box--you had to build it yourself. Worse, EKS didn't integrate with IAM and we couldn't figure out a sane workaround. These issues are more-or-less addressed on the containers roadmap, so things are improving, but there's good reason to use Fargate. Also, I can't imagine feeling "locked in" to Fargate; porting to Kubernetes is pretty straightforward unless you depend on some isolation property provided by Fargate.


What lock-in? I'm genuinely curious, I can't imagine it would be hard at all to move my service from Fargate to anything else. It's just docker containers and DNS, I haven't done anything Fargate specific.


These days "vendor lock-in" seems to be a code word for "I have no idea how this thing works".


The vendor lock-in is having to code/deploy against the AWS API instead of the Kubernetes API. It is no different from coding to the Solaris or Windows API.

The problem with AWS (or any cloud provider) is that nobody else can create their own cloud services on it. See the case of MongoDB.

AWS is a closed source platform, which can be extended only by Amazon.

This will become more evident once more operators are created for Kubernetes.


What API am I coding against? I don't get it, I don't touch any Fargate API.

My Fargate cluster runs Dgraph, an open source database, which my clients connect to the same way they would if it were on EC2.


Right, this is hidden from you by the AWS CLI tools and UI. But, as in any OS, there is an API.

Yes, Fargate does add the serverless features over EC2. However, you are in the AWS ecosystem.


Sorry, but no? There is no API. My service runs in a container. It knows literally nothing about Fargate.

I use an open source database on it. My clients use an open source client to talk to that database, and rely on DNS for service discovery.

The only AWS-specific APIs I rely on are for my eventing system (S3 + SQS), and those sit in a generic library that I could easily swap out for fsnotify, RabbitMQ, etc. None of that has to do with Fargate.
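(For illustration only: a minimal sketch of what such a swap-friendly eventing layer might look like in Python. The interface and names are hypothetical, not the poster's actual library.)

```python
# Hypothetical sketch: hide S3 + SQS behind a small interface so the rest of
# the service never touches AWS APIs directly, and the backend can be swapped
# for fsnotify, RabbitMQ, etc.
from abc import ABC, abstractmethod

import boto3


class EventSource(ABC):
    @abstractmethod
    def poll(self) -> list[str]:
        """Return a batch of raw event payloads."""


class SqsEventSource(EventSource):
    def __init__(self, queue_url: str):
        self._sqs = boto3.client("sqs")
        self._queue_url = queue_url

    def poll(self) -> list[str]:
        resp = self._sqs.receive_message(
            QueueUrl=self._queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=10,
        )
        return [m["Body"] for m in resp.get("Messages", [])]


# A RabbitMQ- or fsnotify-backed EventSource would slot in here without the
# rest of the service knowing anything about AWS, let alone Fargate.
```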


I assume what is being referred to is more the "we have built all this internal tooling, release pipelines and processes, internal know-how even among non-developers" etc. that any technology choice is accompanied by. These effectively lock you in and cost real money to replace.


I get that, although in many cases the AWS-specific tooling needed to build and deploy services running on Fargate is minimal.

It's basically logging in to ECR and triggering an ECS service update (2 API calls).
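(For the curious, a rough sketch of those two calls using boto3; cluster, service, and credential handling are placeholders, and the docker build/push itself happens in between.)

```python
# Sketch of the two AWS calls mentioned above (names are placeholders).
import base64

import boto3

ecr = boto3.client("ecr")
ecs = boto3.client("ecs")

# 1. "Log in to ECR": fetch a temporary token to feed into `docker login`.
auth = ecr.get_authorization_token()["authorizationData"][0]
user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")
registry = auth["proxyEndpoint"]
# ... `docker build`, `docker login`, `docker push` happen here ...

# 2. Trigger an ECS service update so tasks restart on the new image.
ecs.update_service(
    cluster="my-cluster",
    service="my-service",
    forceNewDeployment=True,
)
```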

I deliberately leave the networking and IAM parts out, as I believe they are also needed for a Kubernetes cluster.

What I don't get is that many people seem to think that they are lock-in free just by choosing Kubernetes. But:

1. Kubernetes itself is a very, very opinionated piece of software. You are not free, you have to go the Kubernetes way.

2. The overhead of learning and maintaining Kubernetes is real, compared to a more managed solution like Fargate.

3. If you use Kubernetes, you still have to set up the underlying infrastructure. This infra is still vendor specific. If you go for a managed Kubernetes like GKE or EKS, you still have to deal with a vendor-specific API.


I would rather run kops than GKE/EKS. Also, if needed, we can use Terraform for the vendor-specific infra.

I believe the Kubernetes stack is more intuitive than any AWS (or GCP or Azure) software. The open source approach will spur innovation that will outclass Fargate by miles. Fargate provides single instances; you would need networks/routes/load balancers/autoscalers to make anything meaningful. I bet they will all be provided the AWS way of doing the same things if Fargate gets any traction.


And you're still locked in to your cloud provider with Terraform, since every provider is cloud-specific.


At least the code is uniform and extendable.


Saying your code is uniform because the syntax is the same is like saying developing a website with Java and developing an Android app is "uniform".


"Don't get locked up into avoiding lock-in"

https://martinfowler.com/articles/oss-lockin.html


""" Gregor Hohpe is a technical director in Google Cloud’s office of the CTO. """ He definitely has the intent to market.


Seems like vendor lock-in is the new term for "Not Invented Here".

I also fail to see how Fargate locks you in: as you say, it's just docker and DNS, with a sprinkle of environment variables if you use it with SSM Parameter Store.
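(A sketch of the SSM Parameter Store part, for anyone unfamiliar: the task definition's `secrets` field maps a parameter onto a plain environment variable. All names and ARNs below are placeholders.)

```python
# Sketch: expose an SSM parameter to a Fargate container as an env var.
import boto3

ecs = boto3.client("ecs")
ecs.register_task_definition(
    family="my-service",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    # The execution role needs ssm:GetParameters for the referenced parameter.
    executionRoleArn="arn:aws:iam::123456789012:role/my-task-execution-role",
    containerDefinitions=[
        {
            "name": "app",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
            "secrets": [
                {
                    "name": "DATABASE_URL",  # env var name inside the container
                    "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/my-app/database-url",
                }
            ],
        }
    ],
)
```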


The docker containers are the least complicated bit in production deploys. Orchestrating all the up/down-scaling and releasing, the binding of IP addresses, DNS, etc. - that's where the potential "lock-in" comes in (although I would say from my experience that the main problem with Fargate is that it's several times more expensive than simply putting ECS on spot instances).


Fargate provisioning is tied to AWS, unlike Kubernetes or other systems that aren't Cloud Vendor specific. If I build tooling around Kubernetes, I can move from Google to EKS without needing to change it. However, if I build tooling around Fargate, it's useless outside of AWS.

Keep in mind, this doesn't make it bad or impossible to move out, just more work. That all being said, I still like Fargate even with it tying me to AWS. I think the pros outweigh the cons.


Realistically, if you are running at a scale to actually need K8s and not just doing Resume Driven Development, what are the chances that your CTO is actually going to risk moving their entire infrastructure to another cloud provider?


You're always locked into your infrastructure once you get to any sort of meaningful scale, and when you are trying to "prevent lock-in" you're probably spending more time on engineering effort instead of taking advantage of your vendor's features.

That’s just like all of the software developers using the repository pattern just in case their CTO decides to move from their six figure Oracle installation to Postgres.

Moving away from your chosen vendor is usually not worth the cost, doesn't have the ROI, and isn't worth the business risk of regressions.


Disclosure: I work in AWS.

First thanks for the feedback. We are listening closely and feedback (critical or otherwise) really helps us focus on what to build.

We think of Fargate as simplicity, without compromising on capability. We don’t want price to be a barrier for customers to realize simpler operations models. We are thinking through additional pricing options for customers. We use open source where it matters for our customers and partners. For instance, we recently launched a new capability called FireLens (in preview) that uses Fluentd and Fluent Bit to help partners use a standard codebase and to help developers realize cost savings on logs.

WRT the lock-in comments on this thread, I was a startup product manager in my past life. What really mattered to us was speed of execution and keeping our costs under control. Choose a product that helps you realize those goals (if they resonate with you), whether it is Kubernetes or ECS/Fargate.


I wish my company could use it; we run a bunch of standard ECS clusters, and it'd be nice to not have to provision and maintain servers. Fargate wasn't worth it due to the price difference and just how rigidly they enforce CPU and memory quantization.


Pricing is fine for small things (most people's side project, etc). Fargate still feels very young and I don't think it handles very high scale or resource intensive work all that well/affordably. I think that will come in time.

From a cloud provider PoV, Fargate is a very hard problem - like Lambda except harder because the container might need to run forever.


I did an evaluation at release time, and as you said, the pricing model was a complete joke.


We spend ~$900 a month on Fargate to run all of our dev, stage, QA, and prod environments as well as some other services and SQS consumers. After the recent price decrease we looked at how much reserved instances would save us, and the few hundred in savings would not make sense vs. the over-provisioning and the need to dedicate resources to scaling and new tools to monitor individual containers.

Note: we do have some stuff in Lambda, but its package size restrictions limit us.


Your packages exceed 250 MB??? Wow.



My docker containers average about 5 GB total, and I have lots of them. It is incredibly easy to become bloated; it is hard to stay bloat-free.


We're talking about Lambda though: executing a function. How much code is required to execute a function? The largest .NET Lambda I've managed to create is ~40 MB, which included Chromium.


While Lambda is sold as "functions as a service", I have on several occasions made the particular function being served the router for a Ruby web application. With the Ruby runtime there's really nothing stopping you from running a complete Rails app in Lambda, other than size limitations once you add a few too many gems.


Well, it depends what that function is doing. Do I need a full word2vec mapping? I've got small bits of data that would blow past that limit even compressed, despite the work I want done being a single easy function call.


You're getting downvoted but clearly there is something wrong with the way these devs are using lambdas.


Definitely have come up against package limitations for Lambda with Java big data applications in a high security environment.


Does the 250 MB include reading in packages from S3 during execution?


No, but there is a 512 MB ephemeral storage limit. Reading files from S3 is also very inefficient since they aren't cached at the worker level.


For people who worry about security (either sincerely or the tick-the-box types): what are the pros and cons of managed containers? It seems like you get a reduction in attack surface but also have fewer tools at your disposal.


One of the key things to remember about Fargate is the following.

Each Fargate task has its own isolation boundary and does not share the underlying kernel, CPU resources, memory resources, or elastic network interface with another task. (Source: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/...)

The other part is patching. We are (I work at AWS) responsible for patching the underlying hosts. More details at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/...


> Each Fargate task has its own isolation boundary and does not share the underlying kernel, CPU resources, memory resources, or elastic network interface with another task.

Isn't Lambda the same, since they're both using Firecracker under the hood?


With or without Firecracker the isolation models are similar.


Yes


Fargate solves problems like patch management, but there are other issues. If an attacker were to compromise a Fargate service they'd have less to work with, but in other ways they'd have a lot of advantages over attacking an EC2 host.

On EC2 I can deploy monitoring tools like OSQuery, or rely on audit and other OS subsystems to monitor for suspicious behavior. On Fargate I don't think anything like this is possible.


What security problem would you be addressing with osquery that Fargate doesn't already address? If you're worried that AWS itself is owned, it doesn't matter which AWS service you use.


Attacker owns my service, spawns a shell. OSQuery will tell me that a new process is running. Fargate will not tell me anything, to my knowledge.

OSQuery as a sidecar might work, if you stuff it in the container, but I doubt it can access the audit subsystem and I'm unsure if it's really been tested in that sort of environment.

I'm not concerned about container escapes or the underlying host being owned.


My take: you can run osquery in a container, but people don't, because osquery does a lot of stuff that doesn't make sense in a container environment, where you start from a minimized environment with fine-grained control (honestly, I see a lot more productionized osquery in endpoint security than I do in server-side security). Anything you're trying to accomplish with osquery in a Fargate container, you can presumably accomplish with it or something else.


OSQuery was merely a strawman, I also mentioned the audit subsystem, but these are just implementation details.

I cannot easily, in a supported fashion, track things like process executions or file interactions in a Fargate container. Maybe OSQuery could run as a sidecar, maybe auditd is actually exposed to the container; I honestly don't know.

The end result is that companies leveraging things like AWS Lambda or Fargate are also likely giving up instrumentation that they would consider standard on EC2.

I don't think this is really controversial to say. You can absolutely justify to me that the instrumentation is not worth as much as third-party patch management, a nice ACL system, etc.


I was thinking of this very limitation when posting the original question. If, e.g., there's an RCE issue with my PHP server on Fargate, as the attacker I have a foothold and there's no monitoring inside the container as I try to move laterally. I guess the same is true with EC2-backed containers, though.


You can instrument the container itself (the environment in which RCE against PHP would provide an attacker).

You can't instrument the container engine or the host server, because AWS owns the security of those. But AWS will do a better job with those than you will, or at least, your whole usage of AWS is premised on that.


How often do people actually instrument their containers? I guess my original premise is wrong if you can simply set up the same monitoring inside your container.


You can't do the same things; for instance, some container instrumentation schemes want an LKM to be resident, and that obviously won't work on Fargate. But you can generally accomplish the same goals.


I don't know from a Fargate perspective, but Elastic Container Service (ECS) deploys EC2 servers that do not pass the CIS Benchmark. I don't believe you gain much from a security perspective.


You can roll out your own EC2 instances and Auto Scaling Group for ECS and control the security on them yourself.


This is incredibly true. The only requirement for an ECS cluster member is to be running the ECS agent - which is a Golang binary.

You're free to run a CIS hardened image if you desire to do so.


This is how we roll: CIS as the base, Packer to customize (ECS agent, Docker) into our own AMIs.


Are there OSS or commercial AMIs that have been hardened? Maybe some RHEL or CentOS?


Yeah, if you look at the AWS image marketplace then you'll find some.


For some odd reason, I find the AWS marketplace a bit suspect looking. Not saying that it is, just that that's my impression of it.


I would love a managed Kubernetes Deployment/Job/StatefulSet. Forget managing the cluster or the nodes; just allow me to "apply" a Deployment config with an associated Service straight to the cloud. I will tell you my resource limits; bill me accordingly.

I hope Google Cloud or AWS is working on that. That would have a much wider impact than Fargate.


Check out Google’s CloudRun https://cloud.google.com/run/


There is the old AWS Beanstalk https://aws.amazon.com/elasticbeanstalk/


And have unlimited namespaces, letting you choose either the same namespace as another deployment you have or a new one.

Make it have a default deny all NetworkPolicy and you are all set.

I'd love to see this too.


I’m using Fargate for services that are CPU intensive (i.e. 24/7) and not reactive by nature. It’s been a good experience so far.


For my poor brain, still trying to cope with the onslaught of the huge number of all these cloud service features... this sounds a lot like Kubernetes. Is this just a proprietary version of that? Can someone differentiate them for me?


It's a fully managed version of Amazon ECS (elastic container service). With Fargate, you don't need to manage the EC2 instances that make up the cluster, as is required with ECS.

Early on, Amazon tried to avoid offering a managed Kubernetes service, and so they rolled their own container service in the form of ECS. Later they caved in and created EKS, their Kubernetes platform. ECS is still used as the underpinnings of some of their other services, such as Batch and Fargate.


It's not 100% fully managed. You still need to set auto-scaling rules. I find that mildly annoying.
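(For reference, the auto-scaling rules in question are Application Auto Scaling policies attached to the service. A minimal sketch with placeholder names, target-tracking on CPU:)

```python
# Sketch: attach a CPU target-tracking scaling policy to a Fargate service.
import boto3

aas = boto3.client("application-autoscaling")

aas.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=2,
    MaxCapacity=10,
)

aas.put_scaling_policy(
    PolicyName="cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,  # keep average CPU around 60%
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
    },
)
```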


When I played with Fargate about a year ago or so, this was its Achilles' heel. I was hoping for a solution that would auto-scale very quickly, but the health check intervals and minimum counts to consider a container "running" couldn't go below 30 seconds or so.

As far as I recall, this is all configured on the load balancer, rather than directly inside Fargate. Somehow makes it feel less like a "fully managed" solution, but rather something you have to still tinker quite a bit with.

(compared to Lambda, which you really don't have to worry about scaling at all)

EDIT: [0] indicates that the minimum you can set is 10 seconds (minimum 2 intervals of 5 seconds to consider it "healthy"), if I understand it correctly

[0] https://docs.aws.amazon.com/elasticloadbalancing/latest/appl...
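(If I read that page the same way, the tightest you can go looks roughly like this on the target group; the ARN is a placeholder.)

```python
# Sketch: set an ALB target group health check to the documented minimums.
import boto3

elbv2 = boto3.client("elbv2")
elbv2.modify_target_group(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/0123456789abcdef",
    HealthCheckIntervalSeconds=5,  # minimum interval
    HealthyThresholdCount=2,       # 2 consecutive successes => "healthy"
)
```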


GKE and Google Load Balancer have similar issues: changes to the load balancer/ingress take up to 10-15 minutes to propagate, and if you miss a health check you'll be serving 500s until it magically balances itself out, or you just nuke it out of frustration and wait the 10 minutes for it to configure with your IP again.


My understanding is that with EKS or similar you still have to manually size your Kubernetes cluster; that is, you have to make sure there are enough hardware instances in your cluster for whatever scaling you'll need. With Fargate you're effectively using AWS's own cluster. Your service can scale up and down, and you only pay for the resources that your service actually uses.


Fargate lets you run containers without having to think about the underlying VMs. You say how much CPU and mem you want the container to run with and off it goes.

Kubernetes is a container orchestrator. If you have a k8s cluster, you can launch a container with a specified amount of memory (and I believe CPU) and it will look much like Fargate. However, someone needs to manage the underlying hardware/VMs for the cluster as well as the cluster software/updates (k8s version, various k8s addons/operators, etc). With k8s, you will also need to make sure that you are using resources effectively (total resources required by containers ≈ total resources available on the VMs in the k8s cluster), which is particularly challenging if you are launching new containers often. If you are using a cloud k8s offering (EKS, AKS, GKE, etc) then some or all of the VM management will be handled by the cloud provider, but the software management and resource utilization work will still be up to you.

TLDR Fargate and k8s can be used very similarly, but k8s has a much higher ops/management burden. K8s was designed to do many more things than Fargate and, while that is sometimes great, it comes with a large ops/complexity cost.


Thanks! so confusing trying to keep track of all these different services.


> this sounds a lot like kubernetes ... is this just a proprietary version of that?

Not really. In fact, they announced EKS (their Kubernetes service) on Fargate at the same time they announced EKS itself, even though they seem to have since abandoned it; conceptually they live at slightly different levels of the stack.


I like using Fargate for one-shot tasks that are easy to split up. I used it a couple of times for summary tasks on large batches of satellite data (100s of GBs). Set up a docker image that takes the month for which to do the analysis as an environment variable, then launch 50 or so Fargate tasks in parallel. Fairly easy to set up and it can save quite a lot of time. If it's for short-running jobs, the increased price is not much of an issue. For more complicated, long-running services I feel like I would prefer managed Kubernetes.
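(A rough sketch of that fan-out with boto3, in case it's useful to anyone; cluster, task definition, container name, and subnet are placeholders.)

```python
# Sketch: launch one Fargate task per month, passing the month as an env var.
import boto3

ecs = boto3.client("ecs")

months = [f"2019-{m:02d}" for m in range(1, 13)]
for month in months:
    ecs.run_task(
        cluster="analysis-cluster",
        launchType="FARGATE",
        taskDefinition="satellite-summary",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],
                "assignPublicIp": "ENABLED",
            }
        },
        overrides={
            "containerOverrides": [
                {"name": "worker", "environment": [{"name": "MONTH", "value": month}]}
            ]
        },
    )
```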


What is the advantage of using Fargate instead of Batch for this use case?


I wasn't aware of AWS Batch. Looks like that might actually be more suitable for this use case, so thanks for mentioning it :)


Can I easily SSH into or otherwise interactively get a shell into a Fargate container? I think this is a minimum debuggability requirement for these kinds of services.

https://github.com/aws/containers-roadmap/issues/187 sounds like the answer is "no"?


The short answer is no. The long answer is that you could theoretically set up a user for auth explicitly, run sshd in Fargate containers, and shell in, but it's not going to be worth it as anything other than a toy example.


I wonder what's behind this functionality gap in the managed services. After all we take docker exec, podman exec, kubectl exec etc for granted in troubleshooting.


If you need to SSH into a box, you are not ready for Fargate or Lambda. Think about what commands you would actually run on the shell. What are you looking for or reading? You can find all that information from CloudWatch or other log/metric services. You'll find that at scale, SSHing into a particular EC2 box also doesn't make sense.


Sooner or later you end up needing to debug in production to narrow down phenomena that for some reason don't replicate. Repl du jour, aws cli, lsof, tcpdump, netcat, perf, netstat, iptables, poking around in /proc, valgrind, pdb, replacing binaries with shell scripts that log their args, rerunning various subsystems from the shell with debug/verbose switches, retrieving badly configured core dumps etc. If you've been spared from these kinds of issues, be thankful for your so far sheltered life :)

If your problem is reproducible at will in prod (big if), theoretically you can bake each thing you try into your container build and debug its automation at each step, often slowing the debugging process down prohibitively.


This product always makes me think of Aqua Teen Hunger Force. I wonder if that's where the name originated.

https://youtu.be/uOd7HQoKxcU?t=50


Is EKS a serious competitor to this? It seems like it would be, with the bonus of no lock-in. What's the advantage of Fargate over EKS?


I still haven't managed to SSH into a container (for Django). The best way, I guess, is SSM (AWS Systems Manager), which at least gives a web-based console.

CodePipeline integration was time-consuming to set up. You have to get it to create a JSON file with the image ID and, uh, I'd have to consult my notes.

All told, it was more complicated to set up than I expected.


We're working on this at my workplace as a prerequisite to a widespread deployment of Fargate; it's the only blocker on moving out of Heroku for a large distributed system.

The approach I’m going with is to have an EC2 host attached to the ECS cluster which people can schedule interactive tasks on. Coupled with some scripting (maybe a Lambda function if I decide to get fancy) we can then start a task for any given service on the instance with the same environment and IAM role, but with a command like /bin/yes just to keep it alive. Once that’s running users can SSH to the host instance and docker exec into the container for whatever command they actually wanted to run.

It’s quite a bit more involved than Heroku’s run command, but initial prototypes seem to indicate it’ll work once we wrap it in some tooling.
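(A rough sketch of the keep-alive task part of that approach with boto3; the /bin/yes trick is from the comment above, everything else is a placeholder.)

```python
# Sketch: schedule a keep-alive "interactive" task on the EC2-backed part of
# the cluster, then docker exec into it from the host.
import boto3

ecs = boto3.client("ecs")
ecs.run_task(
    cluster="my-cluster",
    launchType="EC2",              # lands on the attached EC2 instance, not Fargate
    taskDefinition="my-service",   # same image, env, and IAM role as the real service
    overrides={
        "containerOverrides": [
            {"name": "app", "command": ["/bin/yes"]}  # keep the container alive
        ]
    },
)
# Then: SSH to the host instance and `docker exec -it <container> bash` to run
# whatever command was actually wanted.
```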


In general, you should just start with whatever service AWS has that integrates the most features, and once you know what your technical requirements/limitations are, you'll know if you need to back up to a less integrated solution. Worst case, you're paying too much for a solution for a short time, but you have a working MVP.


Does anyone remember that time at Re:Invent 2017 when they announced Fargate, and said that Fargate was coming to EKS "soon"? Let's put odds on which is released first: Fargate for EKS, or Half Life 3.


NodeChef is a good alternative where you don't have to do the tedious job of managing servers. https://www.nodechef.com/


Looks nice. But I put in 2 apps with 4 CPUs and 1 GB of RAM each, and I selected a MySQL database with 1 GB of RAM and 30 GB storage.

And it quoted me $400/month.


Are they allowed to host a MongoDB for you? Isn't there something in the license about that not being allowed?


Not for the versions they offer.




