NETS 2120: Scalable and Cloud Computing
NETS 2120 is one of the flagship courses of the NETS curriculum, and Penn was one of the first universities to offer a course of this type. The course is aimed at sophomores; it covers the science behind massive-scale software systems, such as the platforms operated by Google, Facebook, or Amazon, and it also aims to give the students some "hands-on" experience with the underlying technology. The technical topics include the basics of programming at scale, dealing with faults, relaxed consistency models, network security, etc., as well as Spark and its programming model. We look at several "big data" algorithms, such as PageRank and adsorption, and I present case studies of commercial cloud services. The semester ends with a discussion of current topics, such as blockchains and differential privacy. On the practical side, the course contains four major programming assignments that are completed using the Amazon Web Services platform. Each assignment results in a simple but usable service: a search portal for TED talks, a social network analysis, or a restaurant database.
The final project is to build a cloud-based "mini-Facebook" in teams of four. The required features include a profile page, a "wall" with posting and commenting, a chat, and news recommendations; however, I also give extra credit for creativity and special features, and each year there are plenty of opportunities to award it. For instance, I have seen teams implement Pokemon-themed solutions or social networks for pets, advanced privacy features, or sentiment analysis. Some solutions have even acquired a small user base! As an additional incentive, I have worked with the "real" Facebook to offer a small prize for the best final project.
CIS 5550: Internet and Web Systems
CIS 5550 is an advanced course that is open to both graduate and senior undergraduate students. At its core, it is a distributed systems class, and as such it covers many general distributed-systems topics, such as scalability, interoperability, consistency models, replication, and fault tolerance. However, it is unusual in that it uses one particular web system – Google's search engine – as a case study, and in that it teaches the students all the key elements that go into a system like this. Thus, we also cover the basics of crawling, indexing, and ranking; by the end, students gain a comprehensive understanding of how a massive-scale web system really works.
As with NETS 2120, the course includes a series of programming assignments that expose the students to various aspects of a web system and that lead towards a large final project at the end. The project is to build a small search engine – not unlike Google in its early days – in teams of four, and each assignment develops a basic version of a component that the students will need for this: a web and application server, a scalable web crawler, and a simple clone of Apache Spark.