Stuff The Internet Says On Scalability For October 5th, 2018

  • Quotable Quotes:
    • DEF CON: A voting tabulator currently used in 23 states is vulnerable to being remotely hacked via a network attack. Because the device in question is a high-speed unit designed to process a high volume of ballots for an entire county, hacking just one of these machines could enable an attacker to flip the Electoral College and determine the outcome of a presidential election.
    • @antirez: "After 20 years as a software engineer, I've started commenting heavily. I used to comment sparingly. What made me change was a combination of reading the SQLite and Redis codebases" <3 false myth: code should be auto-explaining. Comments tell you about the state, not the code.
    • @ben11kehoe: Uniquely at AWS, S3 bucket *names* are global despite the buckets themselves being regional, hence creation is routed through us-east-1. I would be surprised to see similar cross-region impacts for other services.
    • @jpetazzo: "Windows 95: the best platform to run Prince of Persia... 4.5 millions lines of code, and today we can run it on top of a JavaScript emulator"—@francesc about the evergrowing size of codebases #VelocityConf
    • @lizthegrey: and then you have the moment that the monolith doesn't scale. So you go into microservices, and a myriad of different storage systems. These are the problems: (1) one problem becomes many, (2) failures are distributed, (3) it's not clear who's responsible. #VelocityConf
    • Tim Bray: I think the CloudEvents committee probably made a mistake when they went with the abstract-event formulation and the notion that it could have multiple representations. Because that's not how the Internet works. The key RFCs that define the important protocols don't talk about sending abstract messages back and forth, they describe actual real byte patterns that make up IP packets and HTTP headers and email addresses and DNS messages and message payloads in XML and JSON. And that's as it should be.
    • romwell: The IT 'cowboys' didn't ride into sunset. They were fired by banks which are/were stupid enough to cut corners on people who know things about their infrastructure before migrating to more modern systems.
    • ben stopford: The Streaming Way: Broadcast events; Cache shared datasets in the log and make them discoverable; Let users manipulate event streams directly (e.g., with a streaming engine like KSQL); Drive simple microservices or FaaS, or create use-case-specific views in a database of your choice 
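    A toy sketch of that pattern in Go (an in-memory slice standing in for the log, a map for the derived view; illustrative only, no Kafka or KSQL involved):

```go
package main

import "fmt"

// Toy illustration of the streaming way: producers broadcast immutable
// events to a shared log, and each consumer replays the log to build its
// own use-case-specific view. Here a slice stands in for the log and a
// map stands in for a database table.
type Event struct {
	Key, Value string
}

func main() {
	eventLog := []Event{
		{"user:1", "alice"},
		{"user:2", "bob"},
		{"user:1", "alice@new-address"}, // later events supersede earlier ones
	}

	// Materialize a view by folding over the log from the beginning.
	view := make(map[string]string)
	for _, e := range eventLog {
		view[e.Key] = e.Value
	}
	fmt.Println(view["user:1"]) // the view is just a disposable cache of the log
}
```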
    • phkahler: When you're really good you hide exploits in plain sight.
    • specialp: I don't understand what the operational burden is. We literally do nothing to our K8s cluster, and it runs for many months until we make a new updated cluster and blow away the old one. We've never had an issue attributed to K8s in the 2 years we have been running it in production. If we ever did, we'd just again deploy a new cluster in minutes and switch over. Immutable infrastructure.
    • arminiusreturns: It isn't just banks either. I'm a greybeard sysadmin type who has seen the insides of hundreds of companies, and for some reason in the late 2000s it seemed like everyone from law firms to insurance companies decided to fire their IT teams and then pay 3x the money for 1/16th the service from contractor/MSP types... and they wonder why they wallow in tech debt... if they even knew what it was; they did fire all the people telling them about it, after all!
    • Ivan Pepelnjak: we usually build oversubscribed leaf-and-spine fabrics. The total amount of leaf-to-spine bandwidth is usually one third of the edge bandwidth. Leaf-and-spine fabrics are thus almost never non-blocking, but they do provide equidistant bandwidth.
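    To make the arithmetic concrete, here's a minimal sketch with hypothetical but typical port counts (48 x 10 GbE edge ports and 4 x 40 GbE uplinks per leaf):

```go
package main

import "fmt"

func main() {
	// Hypothetical but typical leaf switch: 48 x 10 GbE edge ports,
	// 4 x 40 GbE uplinks toward the spines.
	edgeGbps := 48 * 10.0  // 480 Gbps of edge bandwidth
	uplinkGbps := 4 * 40.0 // 160 Gbps of leaf-to-spine bandwidth

	// Uplink bandwidth is one third of edge bandwidth: a 3:1
	// oversubscription. Not non-blocking, but any two edge ports
	// see the same (equidistant) bandwidth across the fabric.
	fmt.Printf("oversubscription: %.0f:1\n", edgeGbps/uplinkGbps)
}
```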
    • Christine Hauser: Police Use Fitbit Data to Charge 90-Year-Old Man in Stepdaughter’s Killing
    • pcwalton:  "I've measured goroutine switching time to be ~170 ns on my machine, 10x faster than thread switching time." This is because of the lack of switchto support in the Linux kernel, not because of any fundamental difference between threads and goroutines. A Google engineer had a patch [1] that unfortunately never landed to add this support in 2013. Windows already has this functionality, via UMS. I would like to see Linux push further on this, because kernel support seems like the right way to improve context switching performance.
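    A rough way to feel out that number yourself is a channel ping-pong in Go, where each send/receive forces a goroutine handoff. A sketch, not a rigorous benchmark; results vary by machine and Go version:

```go
package main

import (
	"fmt"
	"time"
)

// Ping-pong over unbuffered channels: every send blocks until the other
// goroutine receives, so each iteration costs roughly two goroutine
// switches. For real measurements use testing.B and control GOMAXPROCS.
func main() {
	const iters = 1_000_000
	ping, pong := make(chan struct{}), make(chan struct{})

	go func() {
		for i := 0; i < iters; i++ {
			<-ping
			pong <- struct{}{}
		}
	}()

	start := time.Now()
	for i := 0; i < iters; i++ {
		ping <- struct{}{}
		<-pong
	}
	elapsed := time.Since(start)

	// Each round trip includes two handoffs (ping and pong).
	fmt.Printf("~%d ns per switch\n", elapsed.Nanoseconds()/int64(iters)/2)
}
```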
    • @krishnan: I have a different view. Even though edge computing to cloud starts out as a "client - server" kinda model, we will eventually see an evolution towards P2P cloud. There are tons of unused capacity at the edge locations and there is an opportunity to build a more distributed cloud
    • zie: For us, for most web stuff, that is mostly stateless, we scale by just starting more instances of the application. For stuff like Postgres, we scale hardware, versus doing something like citus or offloading reads to a hot-standby or whatever. It gets complicated quickly doing that stuff.
    • @lindadong: I’m sorry NO. One of the worst parts of my time at Apple was the toxic culture this line of thinking bred. Swearing at and insulting people’s work is not okay and is never helpful. Criticism for criticism’s sake is a power move. 
    • @danielbryantuk: "Ops loves serverless. We've embraced this mode of working for quite some time with PagerDuty and Datadog There will always be the need for ops (and associated continuous testing). #NoOps is probably as impractical as #NoDev" @sigje #VelocityConf
    • @danielbryantuk: "Bringing in containers to your organisation will require new headcount. You will also need someone with existing knowledge that knows where the bodies are buried and the snowflakes live" @alicegoldfuss #velocityconf
    • yashap: 100% this. I think people who have not worked directly in marketing dramatically underestimate just how much of marketing at a big company is taking credit for sales that were going to happen anyways. People have performance driven bonuses/promotions/whatever, TRULY generating demand is very hard, taking credit for sales you didn’t drive is WAY easier, and most companies have very feeble checks and balances against this.
    • @peteskomoroch: “[AutoML] sells a lot of compute hours so it’s good for the cloud vendors” - @jeremyphoward
    • Implicated: I spent 3-4 years deep in the blackhat SEO world - it was my living, and it was almost completely dependent on free subdomains because they ranked _so much better_ than fresh purchased domains. Let's use a real world example. Insert free dynamic dns service here - you create a subdomain on one of their 25-100 domains, provide an IP address for that subdomain and.. voilà, spam site. So let's say we've now got spam-site-100.free-dynamic-dns-service.com - its A record is pointed at my host, and I'm serving up super spammy affiliate pages on it. I don't build links to it, that takes too much time and investment... instead I just submit a sitemap to google and move on. That's the short story. The long story is that I built hundreds of thousands of these subdomains for each service of this type I could find, on every one of the domains they made available. Over the course of time it became clear that the performance (measured in google search visitors) was VASTLY different based on the primary domain... to the point that I stopped building for all of them and focused on only a handful of highly performant and profitable domains.
    • Lyndsey Scott (model): I have 27,418 points on StackOverflow; I'm on the iOS tutorial team for RayWenderlich.com; I'm the lead iOS software engineer for @RallyBound, the 841st fastest growing company in the US according to @incmagazine
    • @jpetazzo: "Containers are processes,born from tarballs, anchored to namespaces, controlled by cgroups" 💯 @alicegoldfuss #VelocityConf
    • @BenZvan: If engineers were paid based on how well they document things they still wouldn't write documentation @rakyll #velocityconf
    • @tleam: @rakyll getting into one of the most critical issues with scaling systems: scaling the knowledge of the systems beyond the initial creators.  The bus factor is real and it can be *very* difficult to avoid.  Documentation isn't really an answer. #VelocityConf
    • @jschauma: You know, back in my day we didn’t call it “chaos engineering” but “committing code”; we didn’t need a “chaos day”, we called it Monday. #velocityconf
    • hardwaresofton: I absolutely love Hetzner -- their pricing is near unbeatable. To be a bit more precise, I believe that they offer cut-rate pricing (which is not a bad thing if you're the consumer) but not cut-rate service -- there is just enough for a DIYer to be very productive and cost effective. This gets even easier if you use Hetzner Cloud directly, and they've got fantastic prices for beefy machines there too -- while a t2.micro on AWS is ~$10/month, on Hetzner Cloud a CX51 with 8 vcores, 32GB of RAM, and a 250GB SSD with 20TB of traffic allowed is 29.90 GBP.
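    Back-of-the-envelope, using the quote's own numbers and an assumed exchange rate of ~1.30 USD/GBP (prices and rates have surely drifted since):

```go
package main

import "fmt"

func main() {
	// Numbers straight from the quote; the exchange rate is an assumption.
	const usdPerGBP = 1.30
	awsUSD, awsRAMGB := 10.0, 1.0                     // t2.micro: ~$10/mo, 1 GB RAM
	hetznerUSD, hetznerRAMGB := 29.90*usdPerGBP, 32.0 // CX51: 29.90 GBP/mo, 32 GB RAM

	fmt.Printf("AWS t2.micro: $%.2f per GB RAM/month\n", awsUSD/awsRAMGB)
	fmt.Printf("Hetzner CX51: $%.2f per GB RAM/month\n", hetznerUSD/hetznerRAMGB)
}
```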
    • halbritt: Scaling applications in k8s, updating, and keeping configs consistent are a great deal easier for me than using Ansible or any other config management tool. With tooling built on top of k8s, a developer can spin up a new environment with the click of a button, deploy whatever code they like, scale the environment, etc., with very little to no training. Those capabilities were a tremendous accelerator for my organization.
    • @bridgetnomnom: How do you run your first Chaos Day? 1. Know what your critical systems are 2. Plan to monitor cost of down time 3. Try it on new products first 4. Stream it internally 5. Pick a good physical location 6. Invite the right people for the test #VelocityConf @tammybutow
    • @stevesi: What are the characteristics creating opportunity? • Cloud distribution, mobile usage • Articulated problem/solution • Solves the problem right away • Network and/or viral component • Bundled solutions are bloated and horizontal (old tools do too much/not enough) • Minimal infrastructure requirements
    • @johncutlefish: The fix? Visualize the work. Measure lead times. Blameless retrospectives. Psyc safety. An awareness of the non-linear nature of debt and drag. Listen! And deeply challenge the notion that technical debt can be artfully managed. Maybe? Very, very hard (13/13) #leanagile #devops
    • @lizthegrey: She set out to reduce outage frequency by 20% but instead got a 10x reduction in outage frequency in the first 3 months by simulating and practicing until she got the big wins. 0 Sev0 incidents for 12 months after that 3-month period. #VelocityConf
    • @stillinbeta: Systems at Dropbox were so reliable new oncall engineers were almost never paged - chaos engineering provided much-needed practise! #VelocityConf
    • @chimeracoder: Hacker News: "I could build that in a weekend."
    • Yeah, but could you then operate it for the rest of your life? #velocityconf #velocity

    • Mark Harris: Swildens' research uncovered several patents and books that seemed to pre-date the Waymo patent. He then spent $6,000 of his own money to launch a formal challenge to the '936 patent. Waymo fought back, making dozens of filings, bringing expert witnesses to bear, and attempting to rewrite several of the patent's claims and diagrams to safeguard its survival.
    • SEJeff: I'm a sysadmin who manages thousands of bare metal machines (a touch less than 10,000 Linux boxes). We have gotten to a point in some areas where you can't linearly scale out the app operations teams by hiring more employees, so we started looking at container orchestration systems some time ago (I started working on Mesos about 3 years ago, before Kubernetes was any good). As a result, I got to understand the ecosystem and set of tools / methodology fairly well. Kelsey Hightower convinced me to switch from Mesos to Kubernetes in the hallway at the Monitorama conference a few years back. Best possible decision I could have made, in hindsight. Kubernetes can't run all of our applications, but it solves a huge class of problems we were having in other areas. Simply moving from a large set of statically provisioned services to simple service discovery is life changing for a lot of teams. Especially when they're struggling to accurately change 200 configs when a node a critical service was running on has a CPU fault and panics and reboots. Could we have done this without Kubernetes? Sure, but we wanted to get the teams thinking about better ways to solve their problems that involved automation vs more manual messing around. Centralized logging? Already have that. Failure of an auth system? No different than without Kubernetes; you can use sssd to cache LDAP / Kerberos locally. Missing logs? No different than without Kubernetes, etc. For us, Kubernetes solves a LOT of our headaches. We can come up with a nice templated "pattern" for continuous delivery of a service and give that template to less technical teams, who find it wonderful. Oh, and we run it bare metal on premise. It wasn't a decision we took lightly, but having used k8s in production for about 9 months, it was the right one for us.
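    The static-configs-to-service-discovery shift he describes boils down to clients resolving a stable in-cluster DNS name instead of pinning node addresses. A minimal sketch, with a hypothetical service name:

```go
package main

import (
	"fmt"
	"net"
)

// Instead of 200 config files pinning a critical service to a specific
// node, clients resolve a stable DNS name that Kubernetes keeps pointed
// at healthy pods. "payments.prod.svc.cluster.local" is hypothetical.
func main() {
	addrs, err := net.LookupHost("payments.prod.svc.cluster.local")
	if err != nil {
		// Outside a cluster this lookup fails; in-cluster, CoreDNS/kube-dns
		// answers with the service's current ClusterIP.
		fmt.Println("lookup failed:", err)
		return
	}
	// When the node backing a pod panics and reboots, the endpoints behind
	// this name change -- no config edits required on the client side.
	fmt.Println("service addresses:", addrs)
}
```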
    • Demis Hassabis (DeepMind founder): I would actually be very pessimistic about the world if something like AI wasn't coming down the road. The reason I say that is that if you look at the challenges that confront society: climate change, sustainability, mass inequality — which is getting worse — diseases, and healthcare, we're not making progress anywhere near fast enough in any of these areas. Either we need an exponential improvement in human behavior — less selfishness, less short-termism, more collaboration, more generosity — or we need an exponential improvement in technology. If you look at current geopolitics, I don't think we're going to be getting an exponential improvement in human behavior any time soon. That's why we need a quantum leap in technology like AI.