Ethereum & The Ethereum Foundation

The Ethereum Foundation’s mission is to promote and support research, development and education to bring decentralized protocols and tools to the world that empower developers to produce next generation decentralized applications (dapps), and together build a more globally accessible, more free and more trustworthy Internet.

– https://ethereum.org/foundation

Ethereum is a decentralized platform that runs smart contracts: applications that run exactly as programmed without any possibility of downtime, censorship, fraud or third party interference.

https://ethereum.org

Metropolis, Serenity & The DAO

There has been much debate and significant coverage of The DAO Attack; however, unsurprisingly, more questions have been raised than answered. There isn’t an easy answer because we can’t predict how people and markets will react to the proposed solutions, and even if this entire anomaly was orchestrated, the exposure of Ethereum combined with its autonomous nature makes controlling an outcome unlikely. Alas, a network decision must be made, and I believe there is a perspective which is overlooked, yet important to recognize.

First, let’s take a peek at what the future holds for Ethereum. Specifically, EIP#86 is a set of changes proposed for Metropolis. Here is a high-level summary of what has been proposed:

  • Some block data will be moved into Ethereum’s “World State”
    • Today, blocks are stored outside the data structure which represents the Merkle state root.
    • This is the first step towards moving all state data accessible by the EVM into the world state.
  • Medstate (intermediate state root) removal
    • Today, all transactions are executed serially and after a transaction is executed the state root is calculated and stored in the transaction receipt.
    • This collapses state transitions into blocks, which increases efficiency and is a step forward towards parallel execution of transactions.
  • “we take initial steps toward a model where in the long term all accounts are contracts”

The proposal, posted by Vitalik in April, is an important change that moves toward all accounts having code. When all accounts are code, state transitions such as balance transfers are initiated by account code instead of the underlying implementation of Ethereum.

This is a win for censorship resistance, and it means that in Metropolis we will have a high-level account interface which is vetted and capable of employing fail-safe mechanisms such as multisig.
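
As a toy illustration of what accounts-as-code buys us (plain Ruby pseudo-model, nothing resembling actual client internals; every class and method name below is made up): the transfer rule, multisig fail-safe included, lives in the account itself rather than in the protocol.

# Toy model only: a contract account whose own code gates balance transfers
# behind an m-of-n multisig check. All names here are hypothetical.
class ContractAccount
  def initialize(balance, owners, threshold)
    @balance, @owners, @threshold = balance, owners, threshold
  end

  # The account's code, not the protocol, decides whether the transition is valid.
  def transfer(to, amount, signatures)
    raise 'not enough signatures' if (signatures & @owners).size < @threshold
    raise 'insufficient balance'  if amount > @balance
    @balance -= amount
    to.credit(amount)
  end

  def credit(amount)
    @balance += amount
  end
end

alice = ContractAccount.new(100, %w(k1 k2 k3), 2)
bob   = ContractAccount.new(0,   %w(k4),       1)
alice.transfer(bob, 25, %w(k1 k3))   # succeeds: 2-of-3 owners signed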

With respect to the attack, principles, immutability, code vs consensus, etc., the most pragmatic path forward becomes clear when we also consider the currency and crypto abstraction in Serenity. From EIP#101:

2. Moving ether up a level of abstraction, with the particular benefit of allowing ether and sub-tokens to be treated similarly by contracts

In Serenity a token contract like The DAO will be a first-class account in Ethereum. Naturally, if the functionality of token abstraction is compromised, the immediate course of action would be a hard fork. Fortunately, by the time we get to Serenity, we will have battle-tested and formally verified code to power these first-class token accounts.

How Metropolis and Serenity are implemented isn’t final, and it’s not clear how the market will react to a hard fork, but one thing is clear – the future protocol of Ethereum will include secure token contracts.

Defending Bitcoin

Today, as a result of the Internet and online social networks, the velocity of information between people has become strikingly fast. One immediate impact of this is that our ability to gauge the quality of information is limited, because — well, there may only be 2 people in the world who know exactly what the truth is. In fact, 1 person can create a trend which affects a billion people in a couple of minutes. By consuming information in this manner, it seems that sharing information is starting to become a cult trend, rather than actual communication meant to convey a point or serve a purpose. But can such trends die as fast as they started?

One trending form of information is concealment. Bitcoin seems to be a recent example of a disparity between what is real and what is a trend. Is the value of Bitcoin based on something real, or on a trend? Prior to Bitcoin, there was Tor, BitTorrent, etc. … all the way back to dial-up BBSs. To this day, none of these have prevented law enforcement from going after criminals, yet they continue to be havens for citizens attempting to conceal their actions.

Bitcoin is a computationally private and distributed cryptocurrency that enables concealment of information in a way which wasn’t previously possible. Bitcoin is not anonymous, it is not a generally accepted currency, and I think Bitcoin’s scalability is ultimately limited by government impositions (or the lack thereof) on network packets.

The strongest counter to Bitcoin’s scalability is that while Bitcoin itself may be robust, it requires secure network connectivity, which, as everyone has finally realized, is not robust. A majority of the Internet connections throughout the world are cellular or wireless connections which, in almost all cases, are heavily regulated by governments and legally restricted by the wireless carriers. You can’t just connect your computer to your cell phone and download free movies and music — you probably won’t get past a GB before you’re throttled, and in about 3-6 months you could end up in court.

A cell phone company has its subscribers’ identities, can see what connections are used for, and will follow the law — a copyright holder can file subpoenas, and federal and state regulations make carriers subject to the will of several governmental bodies.

Simply put, state control over major ISPs could immediately cripple Bitcoin. Regardless of it being “distributed”, the fact that Bitcoin depends on a majority-voting and settlement system means that it will slow to a halt if too much latency is added to the network. Major ISPs only need to inject routes which triple latency and, at scale, Bitcoin transactions would trail the market in a way which would lead to arbitrage capable of pricking the perfect hype bubble which Bitcoin has created.

Let’s remember, folks — the US Government funded, invented, and subsidized most of the infrastructure and technologies which Bitcoin depends on.

Maybe the rest of the world isn’t as eager to file lawsuits against citizens over copyrights. But when a system threatens the stability of commerce and currency, most governments, unlike copyright holders, can act first and ask for forgiveness later.

Locking down Facebook Connect

UPDATE #2 (10-Oct 2010): Recently there’s been a lot of talk about session hijacking, thanks to Firesheep and github. Dang. I liked the term fb-yelp-gibbed. The considerations below still apply.

UPDATE: After conversations with a friend, I made a few changes. Specifically, the fbuid is usable on your site, just don’t use it together with the JS library and don’t trust the browser.

User privacy is non-negotiable and developers should be as responsible as Facebook.

How to secure your FB Connect Implementation (so your users don’t get fb-yelp-gibbed):

OLD REST API

  1. DON’T use the JS library (violating this amplifies your users’ exposure; see EXCEPTION below)
  2. Push all FB connect requests through your backend
  3. DON’T STORE a userid or fbid in a cookie (only use fbuid client-side for externals; server should never trust browser-supplied fbuid)
  4. DON’T STORE your app’s FB API “secret” client-side (in javascript, in device app, etc.; NO EXCEPTIONS)
  5. DO store your user’s fbid and/or userid, only, on your server
  6. Never give client-side (JS, scripts, etc.) access to userid or fbid

When appropriate, verify the FB user is who they say they are by using auth.* methods, linked below; if you’re not sure what these do or what they’re for, give yourself 2-4 weeks to understand the ins and outs. OR, See OAuth comments below (and transition to OAuth).

http://developers.facebook.com/docs/reference/rest/auth.getSession

For iPhone/Android, learn how to proxy FB connect requests so you NEVER store your API “secret” on the phone.

The only communication between your user’s browser or device and your fb-app should be whether or not the user has been authenticated. Even then, you should also utilize the rest/auth.* (server-side) methods to ensure the user actually authenticated.
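
To make that concrete, here’s a rough sketch of the proxy pattern, assuming a Rails-style controller. FacebookClient is a hypothetical stand-in for whichever server-side verification you wire up (REST auth.* or OAuth, per the links); the only thing the browser ever learns is a yes/no.

# Sketch only: all FB Connect verification happens server-side. API_KEY and
# API_SECRET live in server config; FacebookClient is a hypothetical wrapper
# around your auth.getSession / OAuth verification call.
class FacebookSessionsController < ApplicationController
  def create
    fb_session = FacebookClient.new(API_KEY, API_SECRET).verify(params[:auth_token])

    if fb_session
      user = User.find_or_create_by_fbuid(fb_session.uid)  # fbuid never leaves the server
      session[:user_id] = user.id                          # your session, not Facebook's
      render :json => { :authenticated => true }
    else
      render :json => { :authenticated => false }, :status => :unauthorized
    end
  end
end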

NEW OAUTH API
Same as above. NEVER send API calls from JS in the browser! Read the authentication guide and understand every concept:

http://developers.facebook.com/docs/authentication/

EXCEPTION
The only exception here is if there’s ZERO user-generated content, ZERO 3rd-party HTML, ZERO 3rd-party JavaScript on a page, and the page and its assets are all sent via SSL. Even then, you’re at the mercy of the user’s desktop — don’t store the userid, fbuid, or API secret anywhere on the client (in code, cookies, etc.)

The other exception here is if you really know what you’re doing and you’ve been dealing with XSS and browser authentication for a decade. In that case, I’m sure all of your application’s assets are served statically (or through SSL), your JS has been gone over with a fine-tooth comb, you don’t let any advertisers or user content sneak in HTML or JS, and you don’t store your FB API secret on the client.

WHY?
This is serious business. Privacy is priceless. Facebook Connect, despite how folks feel, is more secure than many banks. However, their crutch of letting developers do everything with JavaScript, and browsers’ limited support for security (injecting JS is like godmode in Doom), have put Facebook at the forefront of all of our security misgivings.

BUT WHAT ABOUT PRIVACY / PII
A site with a significant user base and an improper FB Connect implementation will, by proxy, give an attacker delegated access to all of the private data that site has access to. Digg being hacked = Digg FB users exploited; Yelp exploited = Yelp FB users screwed — you get the idea.

Please, don’t be that site. It’s easy to blame Facebook, but all they’ve done is made public data public.

Amazon RDS: Poison or Pill

As soon as I read the AWS newsletter about Amazon RDS, I started looking for a megaphone to start shouting at folks – keep away! Amazon RDS, or Relational Database Service, places Amazon into the mire of shared hosting and AWS users into a position of false confidence. Harsh words considering, overall, I feel Amazon’s service offerings are best-in-class. AWS offerings have historically pushed the envelope with regard to practical usage-based computing, something which ancient providers such as Sun and IBM have attempted to accomplish for decades; in this case I define practical as both usable and cost effective for small and large tasks. Up until now, such systems weren’t trivialized to x86 hardware and required special programming considerations, access to academic institutions, and/or a large budget. By combining SLA-supported x86 virtualization with application services such as S3, SQS, and SimpleDB, AWS has provided a usage-based on-demand computing solution which is simpler than task-based computing and as secure and reliable as virtualized or shared hosting. With its on-demand nature, AWS is cost effective for everything from small tasks to those requiring a datacenter of processors.

So why is Amazon RDS so bad, so much that you shouldn’t use it?

Well, there isn’t an easy answer; the better question to ask yourself is: why do you think AWS will be better than your own MySQL deployment? There is no right answer, because almost any answer will probably, one day, bite you in the ass. Hard. I mean data loss, and it won’t be Amazon’s fault.

RDBMS systems and the applications which depend on them are built from the ground up to rely on persistence, integrity, and static data models (schemas). In contrast, AWS has been built for distribution, decentralization, and the “cloud”. For Amazon, this service is somewhat of a U-turn from their original direction and has also placed a stamp on their forehead which says “That MySQL Guy”, which is not good. I have nothing against MySQL; however, as de facto entry-level (free, open source) software, it has accrued a strong following of immature software. Such software has nothing to do with the basic purposes of AWS or MySQL, but has everything to do with how Amazon’s support and engineering staff will be spending their time: supporting users and software which aren’t built for the cloud.

I hope that RDS won’t be a situation of butterflies & hurricanes, but here’s a quick list of why the relative cost of RDS is high both for Amazon (the company) and all of its AWS users:

  • Cost for Amazon (operations, engineers, and products)
    • MySQL, like most open source systems, has historically been buggy software with a trailing release+testing+production schedule which requires continuous testing between production releases for large deployments (such as RDS).
    • MySQL has a large set of features which vary across releases and which share equal presence in production; in other words, Amazon will need to cater to providing production support for multiple versions, not just the latest stable version.
    • Amazon has no control over the features and capabilities of MySQL and is thus limited to what MySQL provides; while MySQL provides many “good things”, Amazon will still be obligated to maintain its way through the bad. AWS MapReduce via Hadoop shares this disadvantage; however, there it is mostly mitigated because MapReduce is such a low-level distributed system.
    • MySQL is very flexible and scales very well; however, it doesn’t do so by itself and requires significant effort to be properly configured for the data being managed. All the folks who don’t know this will default to thinking Amazon will do it for them and will be disappointed when it doesn’t “just work”. Whether they ditch RDS or bug Amazon’s support, either way, it’s not a positive situation.
  • Cost for AWS (primarily EC2) users
    • Potential degradation of service and support for EC2 instances
      • With RDS available, Amazon can deflect issues with running MySQL on EC2 instances into a recommendation to use RDS — this will be a terrible waste of time for both parties.
      • MySQL is a very centralized system and by transitioning the decision of where MySQL resides in the AWS cloud from the user to Amazon, Amazon will be further centralizing the impact of MySQL on the cloud. Whereas users will randomly have MySQL deployed across any EC2 instance, Amazon will be appointing MySQL to specific hardware; this is based on the assumption that Amazon is clustering RDS deployments onto local hardware and not randomly deploying instances in the cloud. This is somewhat of a compromise for security and adds significant SLA risks (read: cost) to Amazon. In short, when a MySQL cluster dies – a LOT of folks are going to be VERY unhappy – their support tickets will be a burden to staff and their requests for credits will be a financial cost. Moreover, support staff will be yielding priority to these customers over other services because of the implicit severity.
    • Increased cost
      • RDS instances cost >10% more than regular instances and only come with the added benefit of backups — something which every system should already have in place. If you do choose to delegate the task of backups to RDS, you’re paying extra for a task you’ve already thought about doing yourself.
      • The cost of keeping your database, its backups, and its history all within AWS is multiplicative, and if you grow to the point where you’re ready to move off, you’ll be charged to transfer all the data to an external system. While this is a subjective cost it’s still worth pointing out; if folks aren’t already doing backups right, they likely won’t know that cost-effective database backups make use of binary logging facilities, not filesystem snapshots, and use significantly less disk space (and thus I/O).
    • False confidence
      • As I’ve mentioned before, letting other folks control your backups for you is a mistake. Failure is a matter of when, not if, and you’ll be in better control of responding if you understand what you’re dealing with. Just because RDS is doing your backups doesn’t mean you’re safe.
      • RDS users will expect MySQL to scale on-demand, since everything else works that way with AWS, and it’s just not that simple. Scaling a database requires analysis and a balanced combination of server settings, data normalization, and indexes; all of these things will still be the user’s responsibility, and Amazon’s solution of “throw hardware at it” is a haunted path to send its users down.

Overall, I feel that Amazon could quickly cannibalize the value and quality of AWS if they (continue to) introduce trivial services. Supporting open source software they have no control over is a significant increase in relative support and operations cost. Amazon seems to be approaching this by pricing RDS instances above EC2, which is a mistake because the real cost is the lost opportunity of engineers spending their time on systems which are more efficient for cloud computing – Amazon could charge 3 times the price of an EC2 instance and their engineers would still be better off building technologies for cloud-based systems rather than centralized RDBMS-dependent web applications.

Where I feel Amazon has fallen short the most is that RDS only provides single-instance MySQL support and nothing more. No load balancing, replication, Hadoop integration, or any other form of data abstraction which could make it functional in a cloud computing context. Not implementing these features is a very clear indicator that AWS is focused more on short-term revenue-generating features than on cost-effective cloud computing systems or improving the shortfalls of legacy centralized systems.

With all this said, I have to consider the possibility of this being a good move for Amazon. I present the potential issues with RDS simply to warn folks away from relying on it as a crutch, and to point out that the new direction AWS has veered in leads into choppy waters. There are several aspects of RDS which will give Amazon insight into correlations among and between the varying systems of data storage and processing – comparing SimpleDB, MapReduce, MySQL, and general resource consumption could shed light onto how their cloud is being used at a higher level than processors and bandwidth. Last, Amazon might be aware that MySQL is a crutch and is putting the service out there as a way to wean folks off of centralized systems.

restful-authentication + subdomain-fu = needing cookie adjustments

I’ve perused several posts about handling cookies when multiple subdomains are involved; however, the solutions were either for older versions of Rails or didn’t resolve my situation: we wanted a cookie which could be used across all subdomains. This might also give you some insight as to why restful-authentication doesn’t have a feature to do all this for you — it keeps changing, and by-hand is best for now. If you’re employing this, do be diligent with security; sharing credentials across domains can be risky business if your security varies across domains.

To do this, first edit config/initializers/session_store.rb where you’ll want to add the key:

:domain => '.example.com'
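
For context, on a Rails 2.3-style app the whole initializer ends up looking roughly like this (the key and secret are placeholders; adjust for your Rails version):

# config/initializers/session_store.rb
ActionController::Base.session = {
  :key    => '_myapp_session',
  :secret => 'replace-with-a-long-random-secret-string',
  :domain => '.example.com'   # leading dot makes the cookie valid on all subdomains
}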

The format here is important – if you don’t prefix the domain with a period, the cookie (and session) will not apply to requests to subdomains. This covers the Rails session — however, we also need to cover the cookie set by restful-authentication, which you’ll find in lib/authenticated_system.rb. In the kill_remember_cookie! and send_remember_cookie! methods, insert the same key as above or a reference to the session_options key. It’ll look like this:

def kill_remember_cookie!
  cookies.delete :auth_token, :domain => ActionController::Base.session_options[:domain]
end
def send_remember_cookie!
  cookies[:auth_token] = {
    :value   => @current_user.remember_token,
    :expires => @current_user.remember_token_expires_at,
    :domain => ActionController::Base.session_options[:domain] }
end

During development you should be aware this might not work using ‘localhost’, depending on your OS. The best thing to do is to edit your hosts file to have “example.local” point to your machine and use those domains for testing instead.

If you’re doing anything more complicated, you’ve got your work cut out for you, as you may need to write custom Rack middleware (see: Google) and/or use a Proc. In the latest Rails, cookies are handled by Rack (instead of CGI); in any version, setting cookies via cookies[:key]= is performed independently of the session options, which is why you must specify the domain separately. Some folks describe monkey patching Rails to set the domain automatically, but this is unreliable as I believe it has changed every release. If you don’t want to have to change it, just create a wrapper method for setting your cookies, or set the domain wherever you set or delete a cookie. We only set one cookie via restful-authentication, so two lines is a fairly simple fix.
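
For example, a pair of helpers along these lines (a sketch; the method names are made up and assume they live somewhere with access to cookies, such as ApplicationController) keeps the domain in one place:

# Hypothetical helpers: every shared cookie picks up the session's domain.
def set_shared_cookie(name, value, options = {})
  cookies[name] = {
    :value  => value,
    :domain => ActionController::Base.session_options[:domain]
  }.merge(options)
end

def delete_shared_cookie(name)
  cookies.delete name, :domain => ActionController::Base.session_options[:domain]
end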

Ruby EventMachine > Python Tornado, Twisted

I was working on a project and wanted to see if Ruby was good enough for responding quickly to HTTP requests. Good thing it, along with Python and every other language, plays well with C/C++. Anyways, EventMachine apparently blows away Tornado and Twisted. I only tested Tornado because it’s the faster of the two, right? What I really wanted to test was whether either of these would fall apart under high concurrency or load. For the “Hello World!” test, they both survived, although, as you can see for Tornado, response times became an issue earlier. I’ve also provided ‘ab’ output for reference – it’s a little more specific with regard to response times. Clearly both of these are hitting a CPU ceiling – with Tornado hitting it faster. For the record, I tested on a dual-core 2.33GHz Xeon with RHEL5, Python 2.6, and Ruby 1.8.5.

Along my adventure through this hnews thread, I came across this most awesome post: Twisted vs. Tornado: You’re Both Idiots

Anyways, what I’m happy about is that there’s a Ruby option for a fast little server which pumps out a bajillion requests per second if you’ve got a farm of servers, and it won’t fall on its face. Also, I don’t have to use Python, and EventMachine is a BREEZE to use. What does suck is that the EM HTTP server isn’t RFC compliant, but that’s probably just a matter of time and I won’t be using HTTP anyways. ymmv
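
For reference, the Ruby side of a test like this is essentially a hello-world handler along these lines (a sketch assuming the evma_httpserver gem, not the exact script I ran):

require 'eventmachine'
require 'evma_httpserver'

# Minimal hello-world responder; EventMachine handles the event loop and connections.
class HelloServer < EM::Connection
  include EM::HttpServer

  def process_http_request
    response = EM::DelegatedHttpResponse.new(self)
    response.status = 200
    response.content_type 'text/plain'
    response.content = 'Hello World!'
    response.send_response
  end
end

EM.run do
  EM.start_server '0.0.0.0', 3001, HelloServer
end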

httperf: Tornado

[root@mail ~]# httperf --port=3002 --num-conns=1000 --num-calls=500 --rate 100 -v
httperf --verbose --client=0/1 --server=localhost --port=3002 --uri=/ --rate=100 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=500
httperf: maximum number of open descriptors = 1024
reply-rate = 5045.8
reply-rate = 4868.5
reply-rate = 4905.4
reply-rate = 4846.9
reply-rate = 4938.4
reply-rate = 4747.3
reply-rate = 4800.2
reply-rate = 4795.6
reply-rate = 4595.3
reply-rate = 4591.1
reply-rate = 4784.6
reply-rate = 4775.9
reply-rate = 4563.3
reply-rate = 4872.3
reply-rate = 4948.8
reply-rate = 4853.0
reply-rate = 4551.3
reply-rate = 4587.3
reply-rate = 4885.7
reply-rate = 4900.2
Maximum connect burst length: 1
Total: connections 1000 requests 500000 replies 500000 test-duration 104.059 s
Connection rate: 9.6 conn/s (104.1 ms/conn, <=1000 concurrent connections)
Connection time [ms]: min 34704.2 avg 93867.4 max 97177.5 median 95862.5 stddev 6293.6
Connection time [ms]: connect 0.0
Connection length [replies/conn]: 500.000
Request rate: 4805.0 req/s (0.2 ms/req)
Request size [B]: 62.0
Reply rate [replies/s]: min 4551.3 avg 4792.8 max 5045.8 stddev 144.4 (20 samples)
Reply time [ms]: response 187.7 transfer 0.0
Reply size [B]: header 156.0 content 12.0 footer 0.0 (total 168.0)
Reply status: 1xx=0 2xx=500000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 2.97 system 99.60 (user 2.9% system 95.7% total 98.6%)
Net I/O: 1079.2 KB/s (8.8*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

httperf: Ruby EventMachine

[root@mail ~]# httperf --port=3001 --num-conns=1000 --num-calls=500 --rate 100 -v
httperf --verbose --client=0/1 --server=localhost --port=3001 --uri=/ --rate=100 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=500
httperf: maximum number of open descriptors = 1024
reply-rate = 11631.7
reply-rate = 9769.5
reply-rate = 9352.3
reply-rate = 10086.1
reply-rate = 8899.4
reply-rate = 9759.3
reply-rate = 9985.1
reply-rate = 10152.8
reply-rate = 10383.9
Maximum connect burst length: 1
Total: connections 1000 requests 500000 replies 500000 test-duration 49.590 s
Connection rate: 20.2 conn/s (49.6 ms/conn, <=984 concurrent connections)
Connection time [ms]: min 229.8 avg 39130.7 max 42870.4 median 41409.5 stddev 6775.5
Connection time [ms]: connect 0.0
Connection length [replies/conn]: 500.000
Request rate: 10082.7 req/s (0.1 ms/req)
Request size [B]: 62.0
Reply rate [replies/s]: min 8899.4 avg 10002.2 max 11631.7 stddev 756.9 (9 samples)
Reply time [ms]: response 78.3 transfer 0.0
Reply size [B]: header 65.0 content 12.0 footer 0.0 (total 77.0)
Reply status: 1xx=0 2xx=500000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 2.20 system 46.84 (user 4.4% system 94.4% total 98.9%)
Net I/O: 1368.7 KB/s (11.2*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

ab: Tornado

[root@mail ~]# ab -c1000 -n100000 http://127.0.0.1:3002/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Finished 100000 requests

Server Software:        TornadoServer/0.1
Server Hostname:        127.0.0.1
Server Port:            3002

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      1000
Time taken for tests:   27.996766 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      16800336 bytes
HTML transferred:       1200024 bytes
Requests per second:    3571.84 [#/sec] (mean)
Time per request:       279.968 [ms] (mean)
Time per request:       0.280 [ms] (mean, across all concurrent requests)
Transfer rate:          586.00 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  197 1102.8      0   20998
Processing:     1   50  37.3     45    5234
Waiting:        0   49  37.4     44    5234
Total:         18  247 1109.2     45   21253

Percentage of the requests served within a certain time (ms)
  50%     45
  66%     48
  75%     52
  80%     57
  90%     77
  95%   1237
  98%   3074
  99%   3112
 100%  21253 (longest request)

ab: EventMachine

[root@mail ~]# ab -c1000 -n100000 http://127.0.0.1:3001/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Finished 100000 requests

Server Software:
Server Hostname:        127.0.0.1
Server Port:            3001

Document Path:          /
Document Length:        12 bytes

Concurrency Level:      1000
Time taken for tests:   15.238117 seconds
Complete requests:      100000
Failed requests:        0
Write errors:           0
Total transferred:      7700077 bytes
HTML transferred:       1200012 bytes
Requests per second:    6562.49 [#/sec] (mean)
Time per request:       152.381 [ms] (mean)
Time per request:       0.152 [ms] (mean, across all concurrent requests)
Transfer rate:          493.43 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   76 603.3      0    9000
Processing:     0   32 264.1     15   10627
Waiting:        0   31 264.1     14   10625
Total:          9  108 752.8     15   14642

Percentage of the requests served within a certain time (ms)
  50%     15
  66%     15
  75%     15
  80%     22
  90%     33
  95%     35
  98%   2999
  99%   3015
 100%  14642 (longest request)

RE: Some Questions & Thoughts re Internet Video vs the Incumbents

Since the Internet is dead and boring it won’t hurt to reply to Mark Cuban’s questions. I once wrote an idea noting that a company may resolve the problem of bandwidth by using satellite, p2p, and broadband together as one hybrid medium. Content would be dynamically requested, routed, multicast, and/or broadcast across any path based on simple supply and demand. Of course, that’s not going to happen because efficiency has never made money and Mark hasn’t made a 3-pointer while doing a back flip.

The Internet is no more dead than Mark Cuban’s stewardship of a sports team; I won’t draw the correlation, but I think you’ll get it. The Internet has changed business forever by practically eliminating the cost of asynchronous communication. It’s a simple mechanism upon which countless inventions will take root in, envelop, and culminate within our society. What is the Internet if it isn’t the best invention short of Electricity? Since when has Electricity been a technology we aren’t trying to do something new with? I’m not saying the Internet isn’t boring, but it’s not dead.

What new doldrums-defying, life-changing “thing” isn’t going to be implicitly catalyzed by the Internet?

Speaking of dead Interwebs: if someone is good at doing something, we need to encourage them, keep them around, and not throw them away because they’re not creating new super-profitable ideas. We put a man on the moon, and space got dead and boring; then we forgot how we did it because nobody cared. Not so bad, we’ve got time to figure that crap out again and it’s not important anyways… right?

Don’t kill the Internet. Please.

= How many people have really given up cable or satellite for internet only delivery of content ? 100k at the most  ? Based on company reports, it seems like people are giving up their wired telephone lines at home long before they give up their cable/sat/telco TV

+ The real numbers to look at are what mediums are gaining numbers and where new content is coming from. Why isn’t Rev3 on Satellite? — I bet there’s a really good reason which contradicts the sustenance of mainstream distribution.

= Why are DVR sales continuing to climb ? if the internet is a better solution, why buy, lease or even use a DVR ? Shouldn’t DVRs be immediately obsolete ?

+ Because consumers have too much money to spend. It’s almost like asking why we pay $100/month for TV service to begin with. Side note: my roommate’s service included the DVR for free.

= Technology doesn’t always move in the direction you expect it to.  Anyone for faster airplanes ? The return of the Concorde ? More efficient electricity grids ? More fuel efficient cars ? You can blame the lack of progress on the incumbents or their industry, but doesn’t that make my point ?

+ Yep. Just like the banks behind our economy new society, incumbents win.

= Read this great post from the NetFlix Blog Why do people ignore in last mile and  home bandwidth constraints ? More devices at home, more utilization,  more hard drive storage, require more backups, which consume bandwidth, whether local or online.

+ People don’t ignore bandwidth constraints; it’s only now, after video and p2p have become mainstream, that people are even noticing congestion. People don’t know any better and when they do, they don’t have a choice otherwise. Again, see incumbents.

= Why do people think that bandwidth to and in the home will grow faster than applications can consume it  ?  If you believe in the inevitable progress of technology and innovation then shouldn’t you believe that this collective genius will come up with better uses of increasing bandwidth than replacing TV ? I certainly do. Health Care, Security, Who knows what, have to be a better and more rewarding use of bandwidth than just TV.

+ This is really an issue independent of video because it goes back to incumbents. It CAN happen but it won’t because it would be Seppuku. Supplying more bandwidth than demand would make it worthless, and that just isn’t going to happen in the US anytime soon. Our population density is too low for cheap bandwidth to be profitable, and content corporations have too much private control over distribution via contracts and licensing to allow a network which would enable free streaming video content. Personally, I think Internet in the home will remain a luxury for the next few generations because, as of right now, people generally care more about their entertainment and comfort than about their health and security.

= Always remember that the long tail of content, whether audio or video, never gets paid. Thats why its on the long tail.  One hit wonders do not disprove the rule. Creating hits is hard and very much a numbers game. Any content game that is a numbers game is expensive to play. Which explains exactly why there are so few internet video only companies (our friends at Rev3 being hopefully a shining exception) making money.

+ Rev3 is around because those guys know how to kick back and say “fuck it” and so does their audience :) Well, that, and because I bought a TShirt from Jim Louderback.

= P2P has been around for how many years ? It has yet to find commercial success anywhere. Its not a solution to any problem and in fact is a huge risk. Anyone with any sense of fairplay knows “free bandwidth” for commercial distribution of content is inherently wrong.

+ Wrong, sort of. p2p has found commercial success, just in a quiet, not-so-profitable fashion. Finding (new) commercial success in p2p now would be silly, akin to seeking commercial success in a new paging company. p2p protocols were developed and published in an open fashion, so they were quickly and easily integrated into and adopted by systems which could take advantage of them. Three abundant uses of p2p are software distribution, backups, and patches (e.g., Blizzard’s updaters and Valve’s STEAM network). My guess is that any major software corporation which isn’t using p2p for software distribution is doing so because their cost for bandwidth is so low that it would cost more for them to staff and support a p2p system. That said, I would agree that “p2p networks” have not found commercial success.

= For all those that think there will be an explosion in bandwidth, remember we are in at least a recession, if not worse. Don’t expect any capital to be invested to take the last mile to multiples of current experiences. In fact, you might see the opposite as capital constraints encourage networks to try to manage as best they can with what they have. It could be far worse on the wireless front as lack of capital could shut down installs.

Short term: Internet is not dead.
Long term: Internet is not dead.

Ruby on Rails, Developers, and More: In Demand NOW!

I attended this month’s SpringStage Startup Happy Hour and came to a conclusion after I spoke with a handful of startup employees and owners: businesses and startups are hiring.

Update: I failed to post supporting information when I wrote this post. Individuals have contacted me regarding 7 positions since the economy started to take a crap in September: three for Rails development, one for PHP, two for product management, and one for iPhone development. All of these positions came to me through LinkedIn, Facebook, Twitter, and SpringStage.

A few reasons why…

  1. Bootstrapped technology startups have low costs thanks to Amazon Web Services and co-working environments
  2. Marketing expenses create more value because word of mouth is faster and presence in large social networks is free
  3. If a startup that’s more than 12 months old hasn’t fallen through the cracks by now, they’re doing pretty damn good
  4. Mobile application platforms are now mainstream
  5. Investors are still investing

If the number of startups looking for good talent in the Dallas area is anything like the rest of the US, then it seems to me it won’t take long for our newly unemployed to get back in the saddle.

Ma.gnolia: “Don’t do your own IT at all” *sigh*

Having managed systems for large (>1M-user) web applications, I had to watch the podcast of Chris Messina and Larry Halff discussing lessons learned from the recent catastrophe which took down Ma.gnolia. In a nutshell, Ma.gnolia didn’t have good backups and the worst-case scenario took place, which led to massive data loss. I would like to give Larry kudos for upfront transparency and facing his users; I’m sure dealing with the problem was stressful enough. I feel Larry’s earned a red badge of courage regardless of the outcome.

“If you’re a startup, don’t do your own IT at all.”
-Larry Halff

This quote is Larry’s “overall lesson learned” and, for the sake of other web applications, I’m going to take it out of context and completely disagree. As cool as Ma.gnolia was, and regardless of where it may go, this is the wrong lesson; in fact, it’s bad advice and presents the same problems which got Ma.gnolia into this situation in the first place.

The real lesson here is that IT should not be underestimated. Business depends on IT, and its needs are as dire as employees’. Unfortunately, it’s easy to overlook IT, and when you do, there’s rarely evidence until it’s too late. Not paying attention to IT is like running red lights and stop signs — one day you’re going to get hit and it’s going to hurt. Outsourcing just makes you a passenger without a seat belt. If you’re a startup, especially a technology startup, you need to be in the driver’s seat.

At one point in time this mostly applied to technology-based companies, but that time has passed for a few reasons. First, IT is flooded with service providers who suck, and service providers rarely admit liability. Second, companies with core businesses outside of technology are no longer able to remain competitive without including technology in their core business processes. Five years ago you could outsource your IT if your business revolved around paperwork or physical labor, but if you’re doing so now you’re facing significant costs and risk versus training or recruiting in-house personnel. I would post numbers to back up these costs, but anyone can find numbers for their cause.

I’m not saying that outsourcing isn’t the right thing to do; in fact, it can be a competitive advantage. But you can’t outsource the responsibility bound to your IT systems. Fortunately, Ma.gnolia can rebuild a better service with no capital, limited effort, and no defense lawyers. In doing so, their IT systems should be treated with as much priority as core operations and as much scrutiny as financials. Your CPA isn’t responsible for your tax errors, and your service provider won’t be responsible if their backup system fails (it happens more than you think!). If you’re going to outsource IT, then know what you’re getting into, maintain your options on a monthly or quarterly basis, have a backup plan, and don’t put all your eggs in one basket. Now Ma.gnolia has the responsibility of not fucking up their reputation a second time. If I were Ma.gnolia, I wouldn’t want to outsource the rest of my dignity. If I did, I would be very scrupulous.

In the area of web applications there are a few prominent service providers I know of which could be considered if you’re going to have someone else do your IT: Engine Yard, Rails Machine, and RightScale. YMMV, and you must keep doing your homework on service providers, because while they may be good today they could be gone tomorrow. Another option, which Larry is taking and should be applauded for, is to open source the effort and involve your community in the process. Twitter has gone through a similar affair; I’m sure the service is better for it, and I’m confident Ma.gnolia will also see positive results.

I don’t feel as though Larry is headed in the wrong direction or that I can speak for Ma.gnolia but if you’re a startup you should do your own IT – just don’t do it alone.

As an aside, I’d also like to point out a subject that’s important and should be considered no matter what IT solutions are chosen: monitoring. Monitoring gives you a better chance at resolving issues before they become critical, as well as a constant perspective on operations. Without monitoring you might not know if backups succeed, if your hard disks are healthy, or if you’re running out of disk space. Moreover, monitoring tells everyone involved what systems are crucial, as well as providing history for analysis. When considering IT systems, don’t forget the proactive value of monitoring.
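
Even a throwaway check is better than nothing. A bare-minimum sketch (the paths and thresholds are hypothetical; a real setup belongs in something like Nagios or monit, not a one-off script):

# Hypothetical paths/thresholds: the point is that disk usage and backup
# freshness get checked by a machine instead of remembered by a human.
usage      = `df -P /var/lib/mysql`.split("\n").last.split[4].to_i   # "Capacity" column, e.g. 87
backup_age = (Time.now - File.mtime('/backups/latest.dump')) / 3600  # hours since last backup

warn "disk almost full: #{usage}% used"          if usage > 90
warn "backup is stale: #{backup_age.round}h old" if backup_age > 26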
