Top Five Ideas from Every Class

Dec 2024

As senior year’s midway mark sees the sun set, all its learnings slip away we shall not let.

6.S894 Accelerated Computing

  1. Creativity counts in GPU land! Every task has a new art for how to partition work in parallel.
  2. GPU architecture: NVIDIA H100s have 132 streaming multiprocessors (SMs). Each SM has 4 warp schedulers, and each scheduler issues one 32-thread warp at a time, so 128 CUDA threads execute at once per SM. Every SM has 256 KB of combined L1 cache and shared memory (up to 228 KB of it usable as shared memory) -- fast! Each of the 4 schedulers has a 64 KB register file, divided among its resident threads -- fastest! Bandwidth from global memory is slow, but a prescient program can queue loads asynchronously. The software model exposes a magic variable called threadIdx.x to distinguish threads and warps. SMs are not synchronized, but warps are guaranteed to stay fed. A kernel can request more than 4 warps and more than 128 CUDA threads, and the hardware will schedule instructions to ensure each virtual warp makes progress; scheduling more helps the hardware stay active. Shared memory is built from 32 banks, each serving one 4-byte word at a time; when the threads of a warp address words that do not overlap mod 32, the hardware avoids serial loads from one bank ("bank conflict").
  3. Matrix multiply: the name of the game is keeping tensor cores fed. A tensor core is a special part of each SM that performs small matrix multiplies, usually 16x16. They go crazy fast: on an H100, around 900 TFLOP/s across all tensor cores, compared to under 100 TFLOP/s for the rest of the arithmetic hardware. Matrix multiply is all about splitting up the 3D cube of i, j, k indices (A[i, k], B[k, j], C[i, j]) into helpful block sizes. Sums of outer products are best for minimal memory bandwidth per FLOP. Cache locality and strides count, too. Choosing the right block size can mean the difference between speed and snailishness.
  4. Parallel scan: this data parallel primitive is useful in many places! Scan takes in a list and accumulates an associative operation, often to create a cumulative sum. For list size 2^k, there's a brilliant algorithm that runs in 2k parallel steps. The first k steps operate on all pairs, then quartets, then octets, and so on to cascade partial power-of-two accumulations until the final element of the list is correct. The next k steps work backward to fill these partial accumulations into the remaining spaces, descending down to octets, then quartets, and finally operating over every pair once more. For longer lists, break the task into blocks and then combine. Example of scan: we want to render one million circles. On a given pixel patch of size 64x64, we want to know which circles are relevant to check more carefully later. We create a binary list of length 1,000,000, marking each circle with 1 if its rectangular bounding box overlaps our patch. We scan this list to create an index list. The final element is the number of relevant circles! The other elements are indices we can use to map the full circle list into the compressed, relevant circle list. Then we can iterate over the relevant circles for per-pixel render checks.
  5. New libraries like ThunderKittens aspire to put the power of building fast kernels into the hands of many more people. ThunderKittens is both beautiful and rough; missing documentation meant diving into its source code, line-by-line, to figure out how it works. Coding in its model felt like walking along a tightrope: straightforward until one foot is out of line, and then we must grapple with the great precipice of complexity below. As it stands, most libraries for GPUs remain in the "trust nobody" stage. Documentation is old, missing, or wrong. The tooling holds incredible power, but its rough edges demand many hours before expertise is won. As time goes on, it does not have to stay this way.
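
To make the bank rule from idea 2 concrete, here is a small Python sketch (a model on paper, not a measurement of real hardware). It assumes 32 banks that each serve one 4-byte word at a time, and that threads reading the very same word are served together. The helper names `conflict_degree` and `strided_access` are made up for illustration.

```python
from collections import Counter

NUM_BANKS = 32  # from the text; each bank is assumed to hold 4-byte words


def conflict_degree(word_indices):
    """Largest number of distinct words that one warp requests from one bank.

    1 means conflict-free; d means the access is served in roughly d passes.
    Duplicate word indices are collapsed first (assumed broadcast behavior).
    """
    per_bank = Counter(index % NUM_BANKS for index in set(word_indices))
    return max(per_bank.values())


def strided_access(stride, warp_size=32):
    """Word index touched by each thread of a warp: thread t reads word t * stride."""
    return [thread * stride for thread in range(warp_size)]


for stride in (1, 2, 32, 33):
    print(stride, conflict_degree(strided_access(stride)))
```

Under this model a stride of 32 words is the worst case (every thread lands in bank 0), while a stride of 33 is conflict-free again, which is why padding a shared-memory row by one word is a common trick.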
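
The blocking in idea 3 can be sketched in plain Python. This is only an illustration of the loop structure, not how a tensor-core kernel is written: `blocked_matmul` and its `block` parameter are hypothetical names, and real kernels would stage each tile in shared memory or registers. Inside a tile, each k contributes one outer product (a column of A times a row of B), so a b-by-b output tile loads about 2b values per b*b multiply-adds, and memory traffic per FLOP falls as 1/b.

```python
def blocked_matmul(A, B, block=2):
    """Compute C = A @ B by walking the (i, j, k) index cube one block at a time."""
    n, depth, m = len(A), len(B), len(B[0])
    assert all(len(row) == depth for row in A), "inner dimensions must match"
    C = [[0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for k0 in range(0, depth, block):
                # One tile of work. Each k adds the outer product of
                # A[i0:i0+block, k] with B[k, j0:j0+block] into the C tile.
                for k in range(k0, min(k0 + block, depth)):
                    for i in range(i0, min(i0 + block, n)):
                        a_ik = A[i][k]  # reused across the whole row of the tile
                        for j in range(j0, min(j0 + block, m)):
                            C[i][j] += a_ik * B[k][j]
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(blocked_matmul(A, B))  # [[58, 64], [139, 154]]
```

The result does not depend on `block`; only the order of the memory accesses changes, which is the whole point of tuning it.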
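
The scan from idea 4 can be sketched in Python as well. This assumes the up-sweep/down-sweep (Blelloch-style) variant, which produces an exclusive scan: element i holds the total of everything before it, so the count of relevant circles is the last scan value plus the last flag. An inclusive scan would put the count in the final element directly. The circle data, the patch, and the helper names are made up for illustration, and padding to a power of two stands in for the block-and-combine step used on long lists.

```python
def exclusive_scan(values, op=lambda a, b: a + b, identity=0):
    """Exclusive scan of a power-of-two-length list in 2k 'parallel' steps.

    Each pass of a while loop is one parallel step: the inner for-loop
    updates touch disjoint elements, so a GPU could do them all at once.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    data = list(values)

    # Up-sweep (k steps): combine pairs, then quartets, then octets, ...
    stride = 1
    while stride < n:
        for right in range(2 * stride - 1, n, 2 * stride):
            data[right] = op(data[right - stride], data[right])
        stride *= 2

    # Down-sweep (k steps): push prefixes back down, ending with every pair.
    data[n - 1] = identity
    stride = n // 2
    while stride >= 1:
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            left_total = data[left]
            data[left] = data[right]                   # prefix before the left half
            data[right] = op(data[right], left_total)  # prefix plus the left half
        stride //= 2
    return data


def compact(items, flags):
    """Keep the flagged items, using scan results as destination indices."""
    n = len(items)
    padded = 1
    while padded < n:
        padded *= 2
    offsets = exclusive_scan(list(flags) + [0] * (padded - n))
    count = offsets[n - 1] + flags[n - 1] if n else 0
    out = [None] * count
    for i in range(n):  # independent writes, so this loop is parallel too
        if flags[i]:
            out[offsets[i]] = items[i]
    return out


def bbox_overlaps_patch(circle, patch):
    """1 if the circle's bounding box overlaps the patch [x0, x1) x [y0, y1)."""
    cx, cy, r = circle
    x0, y0, x1, y1 = patch
    return int(cx + r >= x0 and cx - r < x1 and cy + r >= y0 and cy - r < y1)


patch = (0, 0, 64, 64)
circles = [(10, 10, 5), (200, 200, 3), (70, 30, 10), (63, 63, 1),
           (500, 0, 20), (0, 100, 30), (32, 32, 100), (90, 90, 4)]
flags = [bbox_overlaps_patch(c, patch) for c in circles]
print(flags)                     # [1, 0, 1, 1, 0, 0, 1, 0]
print(exclusive_scan(flags))     # [0, 1, 1, 2, 3, 3, 3, 4]
print(compact(circles, flags))   # the four circles whose boxes touch the patch
```

The per-pixel checks then only iterate over the compacted list instead of all the circles.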

"Well, one thing you could think about is..." - Jonathan Ragan-Kelley during live lab every Tuesday

6.4590 Frontiers of Information Policy

  1. The law is our effort to steer society well. We may draw on wisdom from many places, from common law to the Constitution. And the United States Constitution is replete with good ideas based on history and a negotiated striving. In times when we shape the law, we must use our moral compass expansively: weigh the real interests on both sides. Free speech checks government control; innovation balances safety; police must mesh with private liberty.
  2. The Fourth Amendment grew out of Semayne's Case (1604 - Sir Edward Coke), which held that "a man's home is his castle." The government may not invade a man's home nor consider it a crime if a man kills an invader in his home out of self-defense. The despised "general warrant" of British soldiers led the founding fathers to require a specific warrant in advance of any intrusion on a home. Boyd v. United States (1886) said compulsory production of papers is equivalent to unreasonable search and seizure: if you didn't produce the papers, they could be interpreted maximally negatively; therefore, it really was necessary to turn them in. Weeks v. United States (1914) put the teeth into the Fourth Amendment by forcing evidence to be excluded in court if it was acquired in violation of the Fourth Amendment. Olmstead v. United States (1928) held that wiretapping phone lines without physically trespassing on the suspect's property does not violate the Fourth Amendment. Today's dissent can be tomorrow's majority: Katz v. United States (1967) overturned Olmstead and held that wiretapping a call made from a clear glass phone booth in public does violate Fourth Amendment protections. The test for reasonable expectation of privacy, from Justice Harlan's concurrence, is that the person exhibited an actual (subjective) expectation of privacy and that the expectation be one that society is prepared to recognize as "reasonable." Even though an open street is not private, a phone call made from a booth on that street is private, so ruled the Justices. A year later, Terry v. Ohio (1968) held that stop-and-frisk does not violate Fourth Amendment rights, reasoning that a policeman with facts (not hunches) indicating an imminent crime has a duty to approach the person ("stop") and then, out of personal safety, the right to pat down looking for a gun ("frisk"). Douglas's dissent: this ruling is a "long step down the totalitarian path." Usually information given to a third party is not considered private. But this changed in Carpenter v. United States (2018), which ruled that phones that automatically share your location with cell towers are like "ankle tags." Since one cannot do without a phone in today's world, location data, despite being shared with 3rd parties, still enjoys a reasonable expectation of privacy.
  3. Reno v. ACLU (1997) shaped the internet we know today. The issue was whether federal regulation of indecent content (e.g., pornography) on the internet violates the First or Fifth Amendment rights of website owners. The court ruled that the Communications Decency Act of 1996 violated the First Amendment (in part -- another part was saved by severability, the idea that striking out a couple of words can sometimes save the rest of a law). If the case had gone the other way, a single underage participant on a website could chill speech for everyone else: an unreasonable veto. A lesson: the facts almost always matter most. Facts may seem trifling or quaint, but without them a case is a skyscraper built of sand.
  4. Copyright exists because a line in the Constitution empowers Congress "to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." Nowadays everyone is wondering what to do about AI in light of copyright. Should it be allowed to train on copyrighted material? Should its output be granted copyright? The Constitution suggests that the right reasoning will maximize the progress of science and useful arts, and copyright is just one device to get there. Relatedly, copyright protects the expression (the writing itself) but not the underlying ideas: a patent is what protects technical innovation.
  5. The Digital Millennium Copyright Act (DMCA) requires platforms to swiftly remove content if notified that it is infringing. The wording of the act leaves open whether a platform's responsibility is triggered by a general awareness of widespread infringement or only by specific notifications of infringement. This is important for content hosts like YouTube! Viacom v. YouTube (2012) ruled that specific notifications are the correct interpretation. The ruling landed us in the modern internet of trigger-ready copyright takedowns but also the legal cocoon that allows YouTube itself to operate. Beyond this one case, through rigorous debate, the law is ours to shape to create the most sensible society we can out of legitimate tensions and opportunities.

"And I was there to argue the case!" - Professor Danny Weitzner, about Reno v. ACLU

Prod

  1. Doing things is the thing to do. Go do things!
  2. My purpose is to serve people. Doing a good job of it is hard, and discomfort while learning is all right.
  3. Every week can be beast mode. Momentum creates momentum. Pipeline, parallelize, call people, and make things happen.
  4. To keep afloat in the currents of ambition, keep inner values strong. Stand up for them. Be nice.
  5. Interviewing 20 people in a week generates hard-to-predict great insights.

"It's easier to do ridiculous things if you look ridiculous." - Spruce Campbell, dressed as a Jedi master, handing out Teenage Mutant Ninja Turtle costumes for a scavenger hunt all day around San Francisco

"Being an entrepreneur is stupid. It is like chewing glass." - Richard Dahan, common prelude to inspirational speeches

"What is your deepest goal in life? [Your answer.] Are you doing it? [No.] Well what are you waiting for?" - Richard Dahan, structure of many conversations

MEng

  1. The people you are around matter. Hop labs a few times, because you will feel when it clicks.
  2. Optimizers in deep learning should account for matrix structure in the gradient. This principle led to Muon.
  3. Deep learning is still alchemy, but thinking hard about the linear algebra can lead to principled scaling.
  4. Writing and running experiments takes nontrivial warmup time. It's all too easy to be in the flow of day-to-day life without blocking out time to write and code. Balance input with output: time spent writing and coding should never fall to zero. There is power in a deadline, too. Make them up for yourself. Writing and coding for 40 hours in three days, on one task with a friend, has untold powers for creative breakthroughs.
  5. Always Be Collecting (ABC). Differentiated insights do not come from thinking harder, but from data. Run experiments. Interact with the universe. The same goes in research as in entrepreneurship and in life.

"Hey Laker, I was thinking about Stiefel manifolds last night..." - Jeremy Bernstein