The API Evangelist Blog
This blog represents the thoughts I have while I'm research the world of APIs. I share what I'm working each week, and publish daily insights on a wide range of topics from design to depcration, and spanning the technology, business, and politics of APIs. All of this runs on Github, so if you see a mistake, you can either fix by submitting a pull request, or let me know by submitting a Github issue for the repository.
I’m neck deep in research around data and APIs right now, and after looking at 37 of the Apache data projects it is pretty clear that web APIs are not a priority in this world. There are some of the projects that have web APIs, and there a couple projects that look to bridge several of the projects with an aggregate or gateway API, but you can tell that the engineers behind the majority of these open source projects are not concerned with access at this level. Many engineers will counter this point by saying that web APIs can’t handle the volume, and it shows that the concept isn’t applicable in all scenarios. I’m not saying web APIs should be used for the core functionality at scale, I’m saying that web APIs should be present to provide access to the result state of the core features for each of these platform, whatever that is, which something that web APIs excel at.
From my vantage point the lack of web APIs isn’t a technical one, it is a business and political motivation. When it comes to big data the objectives are always about access, and it definitely isn’t about the wide audience access that comes when you use HTTP, and the web for API access. The objective is to aggregate, move around, and work with as much data as you possibly can amongst a core group of knowledgable developers. Then you distribute awareness, access, and usage to designated parties via distilled analysis, visualizations, or in some cases to other systems where the result can be accessed and put to use. Wide access to this data is not the primary objective, paying forward much of the power and control we currently see around database to API efforts. Big data isn’t about democratization. Big Data is about aggregating as much as you can and selling the distilled down wisdom from analysis, or derived as part of machine learning efforts.
I am not saying there is some grand conspiracy here. It just isn’t the objective of big data folks. They have their marching orders, and the technology they develop reflect these marching orders. It reflects the influence money and investment has on the technology. The ideology that drives how the tech is engineered, and the algorithms handle specific inputs, and provide intended outputs. Big data is often sold as data liberation, democratization, and access to your data, building on much of what APIs have done in recent years. However, in the last couple of years the investment model has shifted, the clients who are purchasing and implementing big data have evolved, and they aren’t your API access type of people. They don’t see wide access to data as a priority. You are either in the club, and know how to use the Apache X technology, or you are sanctioned one of the dashboard analysis visualization machine learning wisdom drips from the big data. Reaching a wide audience is not necessary.
For me, this isn’t some amazing revelation. It is just watching power do what power does in the technology space. Us engineers like to think we have control over where technology goes, yet we are just cogs in the larger business wheel. We program the technology to do exactly what we are paid to do. We don’t craft liberating technology, or the best performing technology. We assume engineer roles, with paychecks, and bosses who tell us what we should be building. This is how web APIs will fail. This is how web APIs will be rendered yesterdays technology. Not because they fail technically, it is because the ideology of the hedge funds, enterprise groups, and surveillance capitalism organizations that are selling to law enforcement and the government will stop funding data systems that require wide access. The engineers will go along with it because it will be real time, evented, complex, and satisfying to engineer in our isolated development environments (IDE). I’ve been doing data since the 1980s, and in my experience this is how data works. Data is widely seen as power, and all the technical elements, and many of the human elements involved often magically align themselves in service of this power, whether they realize they are doing it or not.
I remember when almost all the APIs out there gave us developers access to things we couldn’t ever possibly get on our own. Some of it was about the network effect with the early Amazon and eBay marketplaces, or Flickr and Delicious, and then Twitter and Facebook. Then what really brought it home was going beyond the network effect, and delivering resources that were completely out of our reach like maps of the world around us, (seemingly) infinitely scalable compute and storage, SMS, and credit card payments. In the early days it really seemed like APIs were all about giving us access to something that was out of our reach as startups, or individuals.
While this still does exist, it seems like many APIs have flipped the table and it is all about giving them access to our personal and business data in ways that used to be out of their reach. Machine learning APIs are using parlour tricks to get access to our internal systems and databases. Voice enablement, entertainment, and cameras are gaining access to our homes, what we watch and listen to, and are able to look into the dark corners of our personal lives. Tinder, Facebook, and other platforms know our deep dark secrets, our personal thoughts, and have access to our email and intimate conversations. The API promise seems to have changed along the way, and stopped being about giving us access, and is now about giving them access.
I know it has always been about money, but the early vision of APIs seemed more honest. It seemed more about selling a product or service that people needed, and was more straight up. Now it just seems like APIs are invasive. Being used to infiltrate our professional and business worlds through our mobile phones. It feels like people just want access to us, purely so they can mine us and make more money. You just don’t see many Flickrs, Google Maps, or Amazon EC2s anymore. The new features in mobile devices we carry around, and the ones we install in our home don’t really benefit us in new and amazing ways. They seem to offer just enough to get us to adopt them, and install in our life, so they can get access to yet another data point. Maybe it is just because everything has been done, or maybe it is because it has all been taken over by the money people, looking for the next big thing (for them).
Oh no! Kin is ranting again. No, I’m not. I’m actually feeling pretty grounded in my writing lately, I’m just finding it takes a lot more work to find interesting APIs. I have to sift through many more emails from folks telling me about their exploitative API, before I come across something interesting. I go through 30 vulnerabilities posts in my feeds, before I come across one creative story about something platform is doing. There are 55 posts about ICOs, before I find an interesting investment in a startup doing something that matters. I’m willing to admit that I’m a grumpy API Evangelist most of the time, but I feel really happy, content, and enjoying my research overall. I just feel like the space has lost its way with this big data thing, and are using APIs to become more about infiltrating and extraction, that it is about delivering something that actually gives developers access to something meaningful. I just think we can do better. Something has to give, or this won’t continue to be sustainable much longer.
I’ve talked about this before, but after reading several articles recently about various federal government agencies collecting, and using social media accounts for surveillance lately, it is a drum I will be beating a lot more regularly. Along with the transparency reports we are beginning to see emerge from the largest platform providers, I’d like to start seeing more observability regarding which accounts, both user and developer are out of government agencies. Some platforms are good at highlighting how government of all shapes and sizes are using their platform, and some government agencies are good at showcasing their social media usage, but I’d like to understand this from purely an API developer account perspective.
I’d like to see more observability into which government agencies are requesting API keys. Maybe not specific agencies ad groups, and account details, although that would be a good idea as well down the road. I am just looking for some breakdown of how many developer accounts on a platform are government and law enforcement. What does their API consumption look like? If there is Oauth via a platform, is there any bypassing of the usual authentication flows to get at data, any differently than regular developers would be accessing, or requiring user approval? From what I am hearing, I’m guessing that there are more government accounts out there than platforms either realize, or are willing to admit. It seems like now is a good time to start asking these questions.
I would add on another layer to this. If an application developer is developing applications on behalf of law enforcement, or as part of a project for a government agency, there should be some sort of disclosure at this level as well. I know I’m asking a lot, and a number of people will call me crazy, but with everything going on these days, I’m feeling like we need a little more disclosure regarding how government(s) are using our platforms, as well as their contractors. The transparency disclosure that platforms have been engaging is a good start to the legal side of this conversation, but I’m looking for the darker, more lower level surveillance that I know is going on behind the scenes. The data gathering on U.S. citizens that doesn’t necessarily violate any particular law, because this is such new territory, and the platform terms of service might sanction it in some loopholy kind of way.
This isn’t just a U.S. government type of thing. I want this to be standard practice for all forms of government on the platforms we use. A sort of UN level, General Data Protection Regulation (GDPR). Which reminds me. I am woefully behind on what GDPR outlines, and how the rolling out of it is going. Ok, I’ll quick ranting now, and get back to work. Overall, we are going to have to open up some serious observability into how the online platforms we are depending are being accessed and use by the government, both on the legal side of things, as well as just general usage. Seems like the default on the general usage should always be full disclosure, but I’m guessing it isn’t a conversation anyone is having yet, which is why I bring up. Now we are having it. Thanks.
I encounter a number of folks who really, really, really want to do APIs. You know, because they are cool and all, but they just can’t do what it takes to let go a little, so that their valuable API resources can actually be put to use by other folks. Sometimes this happens because they don’t actually own the data, content, or algorithms they are serving up, but in other cases it is because they view their thing as being so valuable, and so important that they can’t share it openly enough, to be accessible via an API. Even if your APIs are private, you still have to document, and share access with folks, so they can understand what is happening, and have enough freedom to put to use in their application as part of their business, without too much constraint and restrictions.
Some folks tell me they want to do API, but I can usually tell pretty quickly that they won’t be able to go the distance. I find a lot of this has to do with perceptions of intellectual property, combined with a general distrust of EVERYONE. My thing is super valuable, extremely unique and original, and EVERYONE is going to want it. Which is why they want to do APIs, because EVERYONE will want it. Also, once it is available to EVERYONE via an API, competitors, and people we don’t want getting at it, will now be able to reverse engineer, and copy this amazing idea. However, if we don’t make accessible, we can’t get rich. Dilemna. Dilemna. Dilemna. What do we do? My answer is you probably that you shouldn’t be doing APIs.
You see, doing APIs, whether public or privately requires letting go a bit. Sure, you can dial in how much control you are willing to give up using API management solutions, but you still have to let go enough so that people can do SOMETHING with your valuable resource. If you can’t, APIs probably aren’t your jam. They just won’t work if you don’t give your API consumers enough room to breathe while developing and operating their integrations and applications. I understand if you can’t let go. The API game isn’t for everyone, or maybe there is some other data, content, and resources you don’t feel so strongly about that you could start with, and get the hang of doing APIs before you jump in with your prized possessions?
Another thing I might suggest, is that maybe you should twice about why these digital things are so important to you. Is it because they really matter to you, or is because you think they’ll make you a lot of money? If it is just the latter, they are probably not very valuable if you just keep them locked up. The best ideas are the ones that get used. The things that make the biggest impact get shared, and are usually pretty accessible. I’m guessing that most of your anxiety does not come from APIs, and what will happen when you launch them. I’m pretty sure it comes from some unhealthy views about what you have, the stories you’ve been told about intellectual property, and your obsession with getting rich. Again, which leaves us at the part of the story where you probably shouldn’t do APIs–I don’t think they are your jam.
I’m always learning from the API communication practices from out of the different AWS teams. From the regular storytelling coming out of the Alexa team, to the mythical tales of leadership at AWS that have contributed to the platform’s success, the platform provides a wealth of examples that other API providers can emulate.
As I talked about last week, finding creative ways to keep publishing interesting content to your blog as part of your API evangelism and communications strategy is hard. It is something you have to work at. One way I find inspiration is by watching the API leaders, and learning from what they do. An interesting example I recently found out of the AWS security team, was their approach to showcasing the top 20 AWS IAM documentation pages so far in 2017. It is a pretty simple, yet valuable way to deliver some content for your readers, that can also help you expose the dark corners of your API documentation, and other resources on your blog.
The approach from the AWS security team is a great way to generate content without having to come up with good ideas, but also will help with your SEO, especially if you can cross publish, or promote through other channels. It’s pretty basic content stuff, that helps with your overall SEO, and if you play it right, you could also get some SMM juice by tweeting out the store, as well as maybe a handful of the top links from your list. It is pretty listicle type stuff, but honestly if you do right, it will also deliver value. These are the top answers, in a specific category, that your API consumers are looking for answers in. Helping these answers rise to the top of your blog, search engine, and social media does your consumers good, as well as your platform.
One more tool for the API communications and evangelism toolbox. Something you can pull out when you don’t have any storytelling mojo. Which is something you will need on a regular basis as an API provider, or service provider. It is one of the tricks of trade that will keep your blog flowing, you readers reading, and hopefully your valuable API, products, services, and stories floating to the top of the heap. And that is what all of this is about–staying on top of the pile, keeping things relevant, valuable, and useful. If we can’t do that, it is time to go find something else to do.
I’m continuing my research into bot platform observability, and how API platforms are handling (or not handling) bot automation on their platforms, as I try to make sense of each wave of the bot invasion on the shores of the API sector. It is pretty clear that Twitter and Facebook aren’t that interested in taming automation on their platforms, unless there is more pressure applied to them externally. I’m looking to make sure there is a wealth of ideas, materials, and examples of how any API driven platform can (are) control bot automation on their platform, as the pressure from lawmakers, and the public intensifies.
Requiring users clearly identify automation accounts is a good first place to start. Establishing a clear designation for bot users has its precedents, and requiring developers to provide an image, description, and some clear check or flag that identifies an account as automated just makes sense. Providing a clear definition of what a bot is, with articulate rules for what bots should and shouldn’t be doing is next up on the checklist for API platforms. Sure, not all users will abide by this, but it is pretty easy to identify automated traffic versus human behavior, and having a clear separation allows accounts to automatically turned off when they fit a particular fingerprint, until a user can pass a sort of platform Turing test, or provide some sort of human identification.
Automation on API platforms has its uses. However, unmanaged automation via APIs has proven to be a problem. Platforms need to step up and manage this problem, or the government eventually will. Then it will become yet another burdensome regulation on business, and there will be nobody to blame except for the bad actors in the space (cough Twitter & Facebook, cough, cough). Platforms tend to not see it as a problem because they aren’t the targets of harassment, and it tends to boost their metrics and bottom line when it comes to advertising and eyeballs. Platforms like Slack have a different business model, which dictates more control, otherwise it will run their paying customers off. The technology, and practices already exist for how to manage bot automation on API platforms effectively, they just aren’t ubiquitous because of the variances in how platforms generate their revenue.
I am going to continue to put together a bot governance package based upon my bot API research. Regardless of the business model in place, ALL API platforms should have a public bot automation strategy in place, with clear guidelines for what is acceptable, and what is not. I’m looking to provide API platforms with a one-stop source for guidance on this journey. It isn’t rocket science, and it isn’t something that will take a huge investment if approached early on. Once it gets out of control, and you have congress crawling up your platforms ass on this stuff then it probably is going to get more expensive, and also bring down regulatory hammer on everyone else. So, as API platform operators let’s be proactive and take the problem of bot automation on directly, and learn from Twitter and Facebook’s pain.
Photo Credit: ShieldSquare
I’m spending time investing in my data, as well as my database API research. I’ll have guides, with accompanying stories coming out over the next couple weeks, but I want to take a moment to publish some of the raw research that I think paints an interesting picture about where things are headed.
When studying what is going on with data and APIs you can’t do any search without stumbling across an Apache project doing something or other with data. I found 37 separate projects at Apache that were data related, and wanted to publish as a single list I could learn from.
- Airvata** - Apache Airavata is a micro-service architecture based software framework for executing and managing computational jobs and workflows on distributed computing resources including local clusters, supercomputers, national grids, academic and commercial clouds. Airavata is dominantly used to build Web-based science gateways and assist to compose, manage, execute, and monitor large scale applications (wrapped as Web services) and workflows composed of these services.
- Ambari - Apache Ambari makes Hadoop cluster provisioning, managing, and monitoring dead simple.
- Apex - Apache Apex is a unified platform for big data stream and batch processing. Use cases include ingestion, ETL, real-time analytics, alerts and real-time actions. Apex is a Hadoop-native YARN implementation and uses HDFS by default. It simplifies development and productization of Hadoop applications by reducing time to market. Key features include Enterprise Grade Operability with Fault Tolerance, State Management, Event Processing Guarantees, No Data Loss, In-memory Performance & Scalability and Native Window Support.
- Avro - Apache Avro is a data serialization system.
- Beam - Apache Beam is a unified programming model for both batch and streaming data processing, enabling efficient execution across diverse distributed execution engines and providing extensibility points for connecting to different technologies and user communities.
- Bigtop - Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a community around the packaging and interoperability testing of Hadoop-related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, etc…) developed by a community with a focus on the system as a whole, rather than individual projects. In short we strive to be for Hadoop what Debian is to Linux.
- BookKeeper - BookKeeper is a reliable replicated log service. It can be used to turn any standalone service into a highly available replicated service. BookKeeper is highly available (no single point of failure), and scales horizontally as more storage nodes are added.
- Calcite - Calcite is a framework for writing data management systems. It converts queries, represented in relational algebra, into an efficient executable form using pluggable query transformation rules. There is an optional SQL parser and JDBC driver. Calcite does not store data or have a preferred execution engine. Data formats, execution algorithms, planning rules, operator types, metadata, and cost model are added at runtime as plugins.
- Crunch - The Apache Crunch Java library provides a framework for writing, testing, and running MapReduce pipelines. Its goal is to make pipelines that are composed of many user-defined functions simple to write, easy to test, and efficient to run.
- DataFu - Apache DataFu consists of two libraries: Apache DataFu Pig is a collection of useful user-defined functions for data analysis in Apache Pig. Apache DataFu Hourglass is a library for incrementally processing data using Apache Hadoop MapReduce. This library was inspired by the prevalence of sliding window computations over daily tracking data. Computations such as these typically happen at regular intervals (e.g. daily, weekly), and therefore the sliding nature of the computations means that much of the work is unnecessarily repeated. DataFu’s Hourglass was created to make these computations more efficient, yielding sometimes 50-95% reductions in computational resources.
- Drill - Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google’s Dremel.
- Edgent - Apache Edgent is a programming model and micro-kernel style runtime that can be embedded in gateways and small footprint edge devices enabling local, real-time, analytics on the continuous streams of data coming from equipment, vehicles, systems, appliances, devices and sensors of all kinds (for example, Raspberry Pis or smart phones). Working in conjunction with centralized analytic systems, Apache Edgent provides efficient and timely analytics across the whole IoT ecosystem: from the center to the edge.
- Falcon - Apache Falcon is a data processing and management solution for Hadoop designed for data motion, coordination of data pipelines, lifecycle management, and data discovery. Falcon enables end consumers to quickly onboard their data and its associated processing and management tasks on Hadoop clusters.
- Flink - Flink is an open source system for expressive, declarative, fast, and efficient data analysis. It combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases.
- Flume - Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store
- Giraph - Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections.
- Hama - The Apache Hama is an efficient and scalable general-purpose BSP computing engine which can be used to speed up a large variety of compute-intensive analytics applications.
- Helix - Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration.
- Ignite - Apache Ignite In-Memory Data Fabric is designed to deliver uncompromised performance for a wide set of in-memory computing use cases from high performance computing, to the industry most advanced data grid, in-memory SQL, in-memory file system, streaming, and more.
- Kafka - A single Kafka broker can handle hundreds of megabytes of reads and writes per second from thousands of clients. Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees. Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
- Knox - The Apache Knox Gateway is a REST API Gateway for interacting with Hadoop clusters. The Knox Gateway provides a single access point for all REST interactions with Hadoop clusters. In this capacity, the Knox Gateway is able to provide valuable functionality to aid in the control, integration, monitoring and automation of critical administrative and analytical needs of the enterprise.
- Lens - Lens provides an Unified Analytics interface. Lens aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores and optimal execution environment for the analytical query. It seamlessly integrates Hadoop with traditional data warehouses to appear like one.
- MetaModel - With MetaModel you get a uniform connector and query API to many very different datastore types, including: Relational (JDBC) databases, CSV files, Excel spreadsheets, XML files, JSON files, Fixed width files, MongoDB, Apache CouchDB, Apache HBase, Apache Cassandra, ElasticSearch, OpenOffice.org databases, Salesforce.com, SugarCRM and even collections of plain old Java objects (POJOs). MetaModel isn’t a data mapping framework. Instead we emphasize abstraction of metadata and ability to add data sources at runtime, making MetaModel great for generic data processing applications, less so for applications modeled around a particular domain.
- Oozie - Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie is integrated with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs and shell scripts).
- ORC - ORC is a self-describing type-aware columnar file format designed for Hadoop workloads. It is optimized for large streaming reads, but with integrated support for finding required rows quickly. Storing data in a columnar format lets the reader read, decompress, and process only the values that are required for the current query.
- Parquet - Apache Parquet is a general-purpose columnar storage format, built for Hadoop, usable with any choice of data processing framework, data model, or programming language.
- Phoenix - Apache Phoenix enables OLTP and operational analytics for Apache Hadoop by providing a relational database layer leveraging Apache HBase as its backing store. It includes integration with Apache Spark, Pig, Flume, Map Reduce, and other products in the Hadoop ecosystem. It is accessed as a JDBC driver and enables querying, updating, and managing HBase tables through standard SQL.
- REEF - Apache REEF (Retainable Evaluator Execution Framework) is a development framework that provides a control-plane for scheduling and coordinating task-level (data-plane) work on cluster resources obtained from a Resource Manager. REEF provides mechanisms that facilitate resource reuse for data caching, and state management abstractions that greatly ease the development of elastic data processing workflows on cloud platforms that support a Resource Manager service.
- Samza - Apache Samza provides a system for processing stream data from publish-subscribe systems such as Apache Kafka. The developer writes a stream processing task, and executes it as a Samza job. Samza then routes messages between stream processing tasks and the publish-subscribe systems that the messages are addressed to.
- Spark - Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala and Python as well as a rich set of libraries including stream processing, machine learning, and graph analytics.
- Sqoop - Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
- Storm - Apache Storm is a distributed real-time computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing real-time computation.
- Tajo - The main goal of Apache Tajo project is to build an advanced open source data warehouse system in Hadoop for processing web-scale data sets. Basically, Tajo provides SQL standard as a query language. Tajo is designed for both interactive and batch queries on data sets stored on HDFS and other data sources. Without hurting query response times, Tajo provides fault-tolerance and dynamic load balancing which are necessary for long-running queries. Tajo employs a cost-based and progressive query optimization techniques for optimizing running queries in order to avoid the worst query plans.
- Tez - Apache Tez is an effort to develop a generic application framework which can be used to process arbitrarily complex directed-acyclic graphs (DAGs) of data-processing tasks and also a reusable set of data-processing primitives which can be used by other projects.
- VXQuery - Apache VXQuery will be a standards compliant XML Query processor implemented in Java. The focus is on the evaluation of queries on large amounts of XML data. Specifically the goal is to evaluate queries on large collections of relatively small XML documents. To achieve this queries will be evaluated on a cluster of shared nothing machines.
- Zeppelin - Zeppelin is a modern web-based tool for the data scientists to collaborate over large-scale data exploration and visualization projects.
There is a serious amount of overlap between these projects. Not all of these projects have web APIs, while some of them are all about delivering a gateway or aggregate API across projects. There is a lot to process here, but I think listing them out provides an easier way to understand the big data explosion of projects over at Apache.
It is tough to understand what each of these do without actually playing with them, but that is something I just don’t have the time to do, so next up I’ll be doing independent searches for these project names, and finding stories from across the space regarding what folks are doing with these data solutions. That should give me enough to go on when putting them into specific buckets, and finding their place in my data, and database API research.
I’m evolving forward my thoughts on algorithmic observability and transparency using APIs, and I was recently introduced to TLA+, or the Temporal Logic of Actions. It is the closest I’ve come to what I’m seeing in my head when I think about how we can provide observability into algorithms through existing external outputs (APIs). As I do with all my work here on API I want to process TLA+ as part of my API research, and see how I can layer it in with what I already know.
TLA+ is a formal specification language developed by Leslie Lamport, which can be used to design, model, document, and verify concurrent systems. It has been described as exhaustively-testable pseudocode which can provide a blueprint for software systems. In the context of design and documentation, TLA+ can be viewed as informal technical specifications. However, since TLA+ specifications are written in a formal language of logic and mathematics it can be used to uncover design flaws before system implementation is underway, and are amenable to model checking for finding all possible system behaviours up to some number of execution steps, and examines them for violations. TLA+ specifications use basic set theory to define safety (bad things won’t happen) and temporal logic to define liveness (good things eventually happen).
TLA+ specifications are organized into modules.Although the TLA+ standard is specified in typeset mathematical symbols, existing TLA+ tools use symbol definitions in ASCII, using several terms which require further definition:
- State - an assignment of values to variables
- Behaviour - a sequence of states
- Step - a pair of successive states in a behavior
- Stuttering Step - a step during which variables are unchanged
- Next-State Rlation - a relation describing how variables can change in any step
- State Function - an expression containing variables and constants that is not a next-state relation
- State Predicate - a Boolean-valued state function
- Invariant - a state predicate true in all reachable states
- Temporal Formula - an expression containing statements in temporal logic
TLA+ is concerned with defining the correct system behavior, providing with a set of operators for working through what is going on, as well as working with data structures. There is tooling that has been developed to support TLA+ including an IDE, model checker, and proof system. It is all still substantially over my head, but I get what is going on enough to warrant moving forward, and hopefully absorbing more on the subject. As with most languages and specifications I come across it will just take some playing with, and absorbing the concepts at play, before things will come into focus.
I’m going to pick up some of my previous work around behavior driven assertions, and how assertions can be though of in terms of the business contracts APIs put forward, and see where TLA+ fits in. It’s all still fuzzy, but API assertions and TLA+ feels like where I want to go with this. I’m thinking about how we can wrap algorithms in APIs, write assertions for them, and validate across the entire surface area of an algorithm, or stack of API exposed algorithms using TLA+. Maybe I’m barking up the wrong tree, but if nothing else it will get me thinking more about this side of my API research, which will push forward my thoughts on algorithmic transparency, and audit-able observability.
I am working with a team to expose a database as an API. With projects like this there can be a lot of anxiety in exposing a database directly as an API. Security is the first one, but in my experience, most of the time security is just cover for anxiety about a messy backend. The group I’m working with has been managing the same database for over a decade, adding on clients, and making the magic happen via a whole bunch of databases and table kung fu. Keeping this monster up and running has been priority number one, and evolving, decentralizing, or decoupling has never quite been a priority.
The database team has learned the hard way, and they have the resources to keep things up and running, but never seem to have them when it comes to refactoring it and thinking differently, let alone tackling the delivery of a web API on top of things. There will need to be a significant amount of education and training around REST, and doing APIs properly before we can move forward, something there really isn’t a lot of time or interest in doing. To help bridge the gap I am suggesting that we do an entirely new API, with it’s own database, and we focus on database to database communication, since that is what the team knows. We can launch an Amazon RDS instance, with an EC2 instance running the API, and the database team can work directly with RDS (MySQL) which they are already familiar with.
We can have a dedicated API team handle the new API and database, and the existing team can handle the syncing from database to database. This also keeps the messy, aggregate, overworked database out of reach of the new API. We get an API. The database team anxiety levels are lowered. It balances things out a little. Sure there will still be some work between databases, but the API can be a fresh start, and it won’t be burdened by the legacy. The database to database connection can carry this load. Maybe once this pilot is done, the database team will feel a little better about doing APIs, and be a little more involved with the next one.
I am going to pitch this approach in coming weeks. I’m not sure if it will be well received, but I’m hoping it will help bridge the new to the old a little bit. I know the database team likes to keep things centralized, which is one reason they have this legacy beast, so there might be some more selling to occur on that front. Doing APIs isn’t always about the technical. It is often about the politics of how things get done on the ground. Many organizations have messy databases, which they worry will make them look bad when any of it is exposed as an API. I get it, we are all self-conscious about the way our backends look. However, sometimes we still need to find ways to move things forward, and find compromise. I hope this database to database, then to API does the trick.
As each wave of technology comes crashing on the shores of the API space you’ll mostly find me silent, listening and watching what is happening. Occasionally you’ll hear me grumble about the aggressiveness of a single wave, or how unaware each wave is of the rest of the beach, or of the waves that came before them. Mostly I am just yelling back to the waves that claim, “we are going to change the beach forever”, and “we are the wave that matters, better than all the waves that came before us”. Mostly, it is the hype, and the unrealistic claims being made by each wave that bothers me, not the waves themselves.
I do not think that technology won’t have an impact on the beach. I just think that us technologists tend to over-hype, and over-believe in the power each wave of technology, and that we do not consider the impact on the wider beach, and the amount of sand that ends up in everything. I don’t doubt that there will be some gems found in the sand, and that geologically speaking that the ocean plays a significant role in how the coastline is shaped. I’m just choosing to sit back on the bluff and enjoy my time on the beach, and not choosing to be a three year old playing in each of the waves, super excited by the sound each crash makes on the beach. I’m not saying that playing in the waves is wrong, I’m just choosing to look at the bigger picture from up here on the bluff.
You can see one such canvas being painted over the last couple of years with what has become to be known as “bots”. Little automated nuggets of tech goodness, or evil, depending on your location on the beach. People love saying that bots will change everything. They’ll be your assistant. They’ll do everything for you. They’ll automate your life. Take care of your parking tickets. Buy your groceries. Raise your children. Feed hungry people in Africa. When in reality, they tend to be annoying, harassing, and can be mess up an entire election kind of bad. They can DDoS. They can threaten to kill and rape you. But, hey, let’s keep investing in them, and building platforms that support them, without ever acknowledging the negative consequences they have on our beautiful beach.
Some days when I’m swimming in the bot waves I’ll be completely consumed. The undertow grabs me, spins me around, and I don’t know which way is up, and I end up with a mouth and ass-crack full of sand before I can make my way to the beach. I felt this way over last Christmas as I tried to make sense of the fake news engine, and what was coming out of Russia. Other days I feel like I’m walking on the beach collecting agates, finding some polished glass, but occasionally also finding some really beautiful agates. Today is one of those days, and I seem to be finding more bots that are actually useful, and do one thing well, without all the bullshit, and hype. Showing me the potential of this technology, and the specs of usefulness it can bring to our silicon beach.
I’m finding useful bots that will convert a file for me. Transcribe a audio file. Notify me of a changes within my domain(s), which happens to be beach front real estate. Amidst all the sand and gravel I am seeing meaningful bot implementations. They aren’t anywhere near the automation we have been promised, but they are providing some value. Today I have a handful of these agates in the palm of my hand. We’ll see if I’m actually able to use these in my day to day world. Maybe hang one from the window by a string. Mount one in a piece of jewelry that will be worn infrequently. Who knows, maybe one will become something I hold in my regularly, rubbing, soothing me as I do what I do as the API Evangelist each day.
Even with all of this (potential) usefulness I am finding in today’s bot waves, I’m still reminded of the power of the ocean. The dangers of sneaker waves while I’m heads down looking for agates. The power of the undertow while swimming on a sunny day. The ability for thousands of waves to come in and take away the beach, the bluff, and destroy the house I’ve built, as well as my neighbors. I’m reminded that no matter how shiny each gem is that I find in the waves, or how much I love the sound each wave crashing on the beach, but I can never stop thinking about the power of the ocean at scale. I mean, as our president recently point out, we are all “Surrounded by water. Big water. Ocean water.” Let’s not be distracted by each wave, and make sure we are always paying attention to the bigger picture. I feel like we have to do this for the “kids” as well, so the waves don’t sneak up on them.
You won’t find me talking about the acquisition of API startups very often. I’m just not a fan of the game. I am not anti-venture capital, but I find the majority of investment in the API startup ecosystem works against everything we are trying to do with APIs. In my opinion, VC investment shouldn’t be the default, it should be an exception. There are other ways to build a business, and I see too many useful API tools get ruined while playing this game. With that said, I tend to not cover the topic, unless I get really pissed off, or the occasional investment or acquisition that I feel will result in a positive result.
Last week we saw the Runscope acquisition by CA. This is an acquisition that doesn’t leave me concerned. Runscope is a partner of mine, run by people I know and care about, and they offer a tool that is useful in the API sector. If they’d had been acquired by many other bigcos I would have been more concerned, or even upset (if it had been certain ones). However, I have experience with CA, and while they are an enterprise beast, I’ve seen them make acquisitions before that weren’t damaging to the services and tooling they acquired. I trust that CA isn’t acquiring Runscope to just eliminate a strong player from the sector, and that they are actually interested in what Runscope does.
I have seen CA’s role in the API space through the lens of the API Academy team, as well as through public and private conversations with other CA employees, on a variety of other teams. I’ve gone on-site and participated in API training session, and I have seen evidence that CA is invested in helping evolve their enterprise to be an API aware organization. Something that you can see reflected in how they approach doing business with their customers. I’m currently working to help move forward some API curriculum with the API academy team, which wouldn’t be happening if I didn’t feel they were committed to helping invest in API literacy across the API space.
The CA acquisition of Runscope doesn’t leave me nervous. I feel like it is a good match. Also, despite the CEO of Runscope, John Sheehan and I often butting heads about startup and VC culture, I feel like he has played the game in an honest and respectful way. He’s made the best choices he could have as a CEO in this game. He cares about making a high quality, useful API product. He genuinely cares about the API space. Even though I think he loves the startup and investment game a little more than he should. All of this leaves me without the indigestion that API startup investment and acquisitions usually leaves in my stomach. I don’t feel like we are losing yet another valuable tool. I feel like CA will be a good steward of Runscope, and the team will actually get the opportunity to evolve, grow, and do better things.
Nice work y’all! Here is to everything being 200 OK!
I have been helping my partner in crime Audrey Watters (@audreywatters) evolve her data work as part of her Columbia Spencer Education Journalism Fellowship, where she is publishing a wealth of ed-tech funding data to Github. I worked with her to evolve the schema she is using across the Google Sheet, and YAML data stores she is using. Something that will autogenerate APIs (well dynamic JSON) based upon the filename, and the fields she chooses as part of her data stores. I just planted the seeds, and she has been cranking away creating repos, and building data stores since this last summer.
She mentioned to me recently that she thought she had been being consistent in her naming conventions across her work, but had recently noticed some inconsistencies–realizing the importance of a consistent design and schema across the projects, something that really could become problematic at scale if she hadn’t caught. Luckily she was able to fix with some work, and was back on track. She isn’t as automated in the replication of data across her projects, but that is a good thing. It is forcing her to think more deeply about the naming and overall design of her static data APIs, which she uses across many repos, and displayed in a variety of lists, outlines, and stories she is telling around her work.
Audrey has spent seven years listening to me talk about API design blah blah blah, but until she was working with her own data, that she cared about, she didn’t fully grasp some of the API design and implications of working with the access, reusability, and maintenance of data at scale. I’ve offered to automate more of the maintenance, replication, and standardization of data across her repos, but she’s declined. She said she finds it valuable to work with the design, and naming of her data stores, for us in different projects. She likes keeping here YAML data stores in separate repos, and then working with them individually in specific uses cases. As part of her work, she has a master ed-tech investor data store, and API of investors behind each ed-tech company, but then when she aggregates for her wider ed-tech funding, she replicates and names (or renames) it to fit that project.
The work that she is doing is what I consider static API design, where the data is YAML or JSON on Github, but then each project dynamically generates JSON, XML, CSV, or RSS using Liquid, and then also generates HTML UI elements using Liquid as well. It’s not full blown API design, and deployment, but the same API definition, schema, and design concerns come into plays, because if she isn’t thoughtful, and consistent, she will feel the pain at some point at the client level (Liquid/Jekyll). Also, since she is doing this across so many repos, at some point she will begin feeling the pain at a pretty significant scale. However, since she actually cares about the data she is managing, it is important for her to take the time to do it right, and not always opt for automation, so that she can make sure she gets the design and schema details right. Something I wish more API data stewards would realize.
I am preparing my Schema.org Github repo with a variety of data sources for use across my API tooling and other projects. I’m trying to get better at using a common vocabulary, and not reinventing the wheel each time I start a new project. Schema.org has the most robust vocabulary of shared schema available today–so I am using this existing work as the core of mine.
I am slicing and dicing the schema.org vocabulary into several formats that I can use in my OpenAPI-driven editors, and other tooling. I took the JSON-LD representation for Schema.org, and published it as a simpler JSON schema definition format that can be applied quickly to an OpenAPI. It isn’t perfect, and you lose a lot of the semantics in the process, but I think it still provides an important base for API designers, architects, and developers to use across their OpenAPI.
It is pretty verbose, with over 150K lines, but it provides a fairly consolidated view of Schema.org classes, in a single set of definitions:
You can download a copy via the Gist, or you can find as JSON and YAML in my Github repository for this work. I’m going to be creating complete OpenAPI for each Schema.org class, as well as individual JSON schema files for each class. I just haven’t to figure out how to decouple them into individual files, yet containing all the relevant schema. I have the code, I just need to dial it in, when I have more time.
I am going to use this Schema.org JSON schema as an autocomplete in my API design tooling, and using the OpenAPI as the source definition for my API deployment and testing tooling. I’ve been evolving my Human Services Data API work to easily generate server side code using OpenAPI, and I’m going to use the same code base to generate any Schema.org API, and deploy as AWS EC2 instance. I’m not looking to develop a SaaS solution, but a quick deploy solution for my own work, and projects I work on with my clients. As I work with more, I will validate that each of these definitions are 100% correct, and properly represent the Schema.org vocabulary.
I know people don’t understand why I’m so obsessed with APIs. Sometimes I ask the same question. When I began in 2010, it was 75% about my belief in the good that APIs can do, and 25% about pushing back on the bad things being done with APIs. In 2017, it is 15% about the good, and 85% about pushing back on the bad things that APIs can do. API driven platforms are being used for some pretty shady things these days, and increasingly they are a force for disruption, and not about making the world a better place.
With this in mind, I wanted to take a moment to highlight the API stack right now that is being used to disrupt the world around us. These are the APIs that have shifted the political landscape in the U.S., and are being used to replicate, automate, and scale this disruption around the world.
- Facebook - The network effect is what brings the troublemakers to Facebook.They are on pace to have 2 billion active users. Something that has the potential to create quite a network effect when sharing stories and links, and when you seed that, target it, and grow it using the Facebook advertising engine–it makes for an excellent engine for disruption.
- Twitter - Twitter is a different beast. Less of the mainstream population than Facebook enjoys, but still a sizable, and very public audience. You can use the Twitter engine to spin things up, get people sharing, do some of the same sharing of stories and links, seeding, targeting, and growing with advertising. Often times the viral nature will spread to Facebook on take on a life of its own.
- Reddit - Now Reddit is entirely just an organic engine for disseminating information, which makes it great for propaganda, everything fake, and stoking the haters. The network effect that is Reddit, works very, very well will Twitter and Facebook, making for a perfect storm of virality that can spread like wildfire.
- WordPress - WordPress is where the news, and other websites get published. Because WordPress is an open source solution, it can be installed anywhere. It can be installed as many times as you want, with no costs beyond your hosting. Each of those installations have an API, which allow you to easily publish across hundreds or thousands of installations. When you slap advertising on these beasts, and plant your seeds across Facebook, Twitter, and Reddit, you make for a pretty efficient propaganda machine.
- Google - Google is the advertising engine for use on the WordPress sites. Google Adwords and Adsense are where disruptors buy the ads they need to plant seeds, but more importantly, it is how they generate revenue from the click throughs, and page views generated from the Facebook, Twitter, and Reddit network effects. Beyond advertising, the Google index provides another great way for the message to spread. All you have to do is play the SEO game, which is something that is greatly aided, and gamed, by the inbound traffic received from Facebook, Twitter, and Reddit.
This is a pretty simplistic snapshot of the API surface area that is being used to mess with our realities right now, but if you break down the number API resources available for each of these platforms, you begin to see all the knobs and dials that can be turned at scale, to disrupt the world. I’ll write another post that walks through all the paths and endpoints, but I want to look at the problem from the 100K view, and point the finger at APIs. Without them, such a small group of people wouldn’t be able to do such a huge amount of damage.
The key driver of disruption here is the advertising engine brought to the table by Google, Facebook, and Twitter, which allows troublemakers to sew the seeds of destruction, as well as target, and grow revenue that supports their efforts. Secondarily, the network effect of Facebook, Twitter, and Reddit allow for the larger demographics to be stirred up, and mobilized, and potentially completing the loop for very disruptive memes. When advertising alone can’t drum up the attention needed, the API driven bot armies on Twitter, Facebook, and Reddit pick up the slack, get to work voting up, liking, sharing, until each platforms algorithms get triggered, and do the rest of the work. Not every message will enjoy the virality necessary to do damage, but every once in awhile the disruptors will hit the big time, and there message will take on a life of its own, with each platform doing the rest of the work for them.
All of this would be possible without APIs. However, it would take armies of people to do. APIs are the bullhorn, the amplification, the automation needed to scale this type of disruption. In my next post I will publish a complete list of the knobs and dials that can be turned by the disruptors, to crank up the volume on their campaigns, and to direct their bot armies in support of a specific message, or to attack a target. All of the advertising and targeting engines for the platforms above have APIs, allow campaigns to be automated, and scaled for both spending, and generating revenue. WordPress APIs make the open source platform into a kind of printing press for the propaganda machine, with Facebook, Twitter, Google, and Reddit APIs acting as the messenger boys. APIs are the key element, that makes all of this seem so deafening at the moment.
APIs are not evil, nor are they good, or even neutral. They are just a tool. They only do what the platform operators design them to do, and what the API consumer decides to put it to work doing. This is what has made APIs so interesting to me. When you set this stage, the possibilities for innovation have been great. However, recent history has shown, when advertising is the main revenue engine for both the platforms and consumers, the API providers are more than willing to ignore and look the other way at what is going on. This has made for a perfect storm of disruption at a scale we’ve never seen before, and because of the technical complexity, it is something that most people don’t even see happening. They can’t see the strings. They can’t see how their world is being disrupted. They don’t see their role in it. They don’t understand that things are being amplified and algorithmically distorted–they just think that is the way the world is.
After seven years of telling stories on API Evangelist I’ve had to repeat myself from time to time. Honestly, I repeat myself A LOT. Hopefully I do it in a way that some of you don’t notice, or at least you are good at filtering the stories you’ve already heard from your feed timeline. My primary target audience is the waves of new folks to the world of APIs I catch with the SEO net I’m casting and working on a daily basis. Secondarily, it is the API echo chamber, and folks who have been following me for a while. I try to write stories across the spectrum, speaking to the leading edge API conversations, as well as the 101 level, and everything in between.
Ask anyone doing API evangelism, advocacy, training, outreach, and leadership–and they’ll that you have to repeat yourself a lot. It is something you get pretty sick of, and if you don’t find ways to make things interesting, and change things up, you will burn out. To help tell the same story over and over I’m always looking for a slightly different angle. Let’s take API Meetups as an example. Writing a story about conducting an API Meetup has been done. Overdone. To write a new story about it I’ll evaluate what is happening at the Meetup that is different, or maybe the company, or the speaker. Diving into the background of what they are doing looking for interesting things they’ve done. You have to find an angle to wrap the boring in something of value.
API documentation is another topic I cover over, and over, and over. You can only talk about static or interactive API documentation so much. Then you move into the process behind. Maybe a list of other supporting elements like code samples, visualizations, or authentication. How was the onboarding process improved? How the open source solution behind it simplifies the process. You really have to work at this stuff. You have to explore, scratch, dig through your intended topic until you find an angle that you truly care about. Sure, it has to matter to your readers, but if you don’t care about it, the chances of writing an interesting story diminishes.
This process requires you to get to know a topic. Read other people’s writing on the topic. Study it. Spin it around. Dive into other angles like the company or people behind. Spend time learning the history of how we got here with the topic. If you do all this work, there is a greater chance you will be able to find some new angle that will be interesting. Also, when something new happens in any topical area, you have this wealth of knowledge about it, and you might find a new spark here as well. Even after all that, you still might not find what you are looking for. You still end up with many half finished stories in your notebook. It is just the way things go. It’s ok. Not everything you write has to see the light of day. Sometimes it will just be exercise for the next round of inspiration. That hard work you are experiencing to find a good story is what it takes to reach the point where you are able to discover the gems, those stories that people read, retweet, and talk about.
I was having one of my regular calls with the Tyk team as part of our partnership, discussing what they are up to these days. I’m always looking to understand their road map, and see where I can discover any stories to tell about what they are up to. A part of their strategy to build awarness around their API management solution that I found was interesting, was the API Surgery event they held in Singapore last month, where they brought together API providers, developers, and architects to learn more about how Tyk can help them out in their operations.
API surgery seems like an interesting evolution in the Meetup formula. They have a lot of the same elements as a regular Meetup like making sure there was pizza and drinks, but instead of presentations, they ask folks to bring their APIs along, and they walk them through setting up Tyk, and deliver an API management layer for their API operations. If they don’t have their own API, no problem. Tyk makes sure there are test APIs for them to use while learning about how things work. Helping them understand how to deliver API developer onboarding, documentation, authentication, rate limiting, monitoring, analytics, and the other features that Tyk delivers.
They had about 12 people show up to the event, with a handful of business users, as well as some student developers. They even got a couple of new clients from the event. It seems like a good way to not beat around the bush about what an API service provider is wanting from event attendees, and getting down to the business at hand, learning how to secure and manage your API. I think the Meetup format still works for API providers, and service providers looking to reach an audience, but I like hearing about evolutions in the concept, and doing things that might bring out a different type of audience, and cut out some of the same tactics we’ve seen play out over the last decade.
I could see Meetups like this working well at this scale. You don’t need to attract large audiences with this approach. You just need a handful of interested people, looking to learn about your solution, and understand how it solves a problem they have. Tyk doesn’t have to play games about why they are putting on the event, and people get the focus time with a single API service provider. Programming language meetups still make sense, but I think as the API sector continues to expand that API service provider, or even API provider focused gatherings can also make sense. I’m going to keep an eye on what Tyk is doing, and look for other examples of Meetups like this. It might reflect some positive changes out there on the landscape.
Disclosure: Tyk is an API Evangelist partner.
This post is from the latest copy of my API Evangelist API Design Industry Guide, which provides a high level look at the API design layer of the industry. Providing a quick look at the services, tools, and some of the common building blocks of API design. The guide is heavily rooted in REST and hypermedia, but is working to track on the expansion of the space beyond just these formats. My industry guides change regularly, and I try to publish the articles from them here on the blog to increase their reach and exposure.
GraphQL is a query language designed by Facebook to build client applications using a flexible syntax and provide a system for describing the data requirements and interactions required by each application. GraphQL began as a Facebook project that soon began powering all their mobile applications. By 2015, became a formal specification. GraphQL provides a query language for your APIs that allows users to describe how they would like their API requests be fulfilled. The approach shifts the API design process to be more about request flexibility requiring API providers to design all API paths ahead of time. It opts for an augmented query language over investing in static schema that requires specific API paths.
REST APIs focus on paths to your resources, but GraphQL is all about fields and data types, with everything accessed through a single API path. GraphQL does a better job of providing a more comprehensive approach access to data stored in a database by offloading design to the query layer for interpretation at query render time. The ability to define what data is returned opens up some interesting approaches to delivering resources, especially when it comes to potentially constrained network environments.
When it comes to providing access to data used in responsive web and mobile applications, GraphQL can be successful in allowing application developers to get exactly what they need for an interface and nothing more. This can increase performance and give UI / UX designers more of a voice in what an API does. GraphQL has played a significant role in the evolution of React, Facebook’s open source framework for deploying user interfaces. React is well-known has achieved some significant traction in application development circles. This design approach to delivering data using APIs a natural fit for rapidly delivering web and mobile apps.
GraphQL has seen some significant adoption beyond Facebook, notably at Github and Pinterest. GraphQL strengths become clear when it is used to deliver complex data stores quickly and efficiently by developers that require a greater level of control what data they need. While GraphQL is not traditional API design, it is an important design constraint to consider when planning the future of your API design practices and toolbox.
This post is from the latest copy of my API Evangelist API Design Industry Guide, which provides a high level look at the API design layer of the industry. Providing a quick look at the services, tools, and some of the common building blocks of API design. The guide is heavily rooted in REST and hypermedia, but is working to track on the expansion of the space beyond just these formats. My industry guides change regularly, and I try to publish the articles from them here on the blog to increase their reach and exposure.
gRPC is a high-performance open source remote procedure call (RPC) framework that is often used to deploy APIs across data centers that also supporting load balancing, tracing, health checks and authentication. While gRPC excels in more controlled, tightly coupled environments, it is also applicable for delivering resources to web, mobile, and other Internet connected devices.
When crafting gRPC APIs, you begin by defining the service using Protocol Buffers, a language and toolset for binary serialization that has support across 10 leading programming languages. Protocol Buffers can be used to generate client and server stubs in these programming languages with tight API/client coupling — delivering a higher level of performance than your average REST API and SDK can.
gRPC API design patterns takes advantage of HTTP/2 advances and uses authenticated bi-directional streaming to deliver APIs that can be scaled to millions of RPC calls per second. Its an effective approach for larger, more demanding API platforms that have begun to see the performance limitations of a more RESTful API design approach. gRPC is not ideal for every API implementation, but is definitely an approach providers should consider when high volumes anticipated, especially within the data center or other tightly controlled environment.
Google has been using gRPC internally for over a decade now, but has recently committed to delivering all their public APIs using gRPC in addition to RESTful APIs, demonstrating that the API design patterns can coexist. This approach makes it a welcome addition to any microservice style architecture. It has the added benefit of API management features like authentication, tracing, load balancing, and health checking that are required to deliver high performance.
gRPC is definitely more of an industrial grade API design pattern, shifting APIs into the next gear when it comes to performance. It also leverages the next generation of the HTTP protocol, HTTP/2. While not an API design pattern that every API provider will be working with, they should be aware it exists so that they understand what it is and the role it plays in the space.
I was hanging out with my friend Mike Amundsen (@mamund) in Colorado last month and we ended up discussing folks uncertainty with APIs. You see, many folks that he has been talking to were extremely nervous about all the unknowns in the world of APIs, and were looking for more direction regarding what they should be doing (or not doing). Not all people thrive in a world of unknown unknowns, and not even in a world of known unknowns. Many just want a world of known knowns. This is something that makes the API landscape a very scary thing to some folk, and world where they will not thrive and be successful unless we can all begin to find a way to help them understand that this is all a journey.
I love figuring all of this API stuff out, and I know Mike does too. We like thinking about the lofty concepts, as well as figuring out how to piece all the technical elements together in ways that work in a variety of business sectors. Many folks we are pushing APIs on aren’t like us, and just want to be told what to do. They just want the technology solution to their problem. A template. A working blueprint. It freaks them out to have so many options, possibilities, patterns, and directions they take things. I feel like we are setting folks up for failure when we talk them into embarking on an API journey without the proper training, equipment, support, and guidance.
I think about the last seven years doing this, and how much I’ve learned. Realizing this makes me want to keep doing APIs, just so I can keep learning new things. I thought I understood REST when I started. I didn’t. I thought I understand the web when I started, I didn’t (still don’t). I was missing a lot of the basics, and no matter what folks told me, or how precise their language was, I still needed to bang my head on something over and over before I got it. I was missing a significant amount of why hypermedia can be a good approach without truly understanding content negotiation, and link relations. Realizing how much I still need to explore and learn has only emboldened me on my journey, but I’m not convinced this will be the case with everyone. We are wrong to assume everyone is like us.
As technologists and autodidacts we often overestimate our own ability, as well as what others are capable of. We realize APIs are not a destination, but a journey. However, we suck at explaining this to others. We are horrible at understanding all of the stepping stones that got us here, and recreating them for others. I put myself into this group. I think about this stuff full time, and I still regularly stumble when it comes to on-boarding folks with what API are, and properly helping them in their journey. I still do not have a proper set of on-boarding lessons for folks, beyond my API 101 stuff. I talk a lot of talk about the API life cycle, the API economy, and all the business and politics of APIs, but I still can’t point folks to where the yellow brick road is. We have to get better at this if we expect folks to ever catch up.
This is one reason I feel Zapier, and other iPaaS providers are so important. We should be helping people understand APIs and integration in context of the problems they are trying to solve, not in terms of REST, SDKs, or any of the other technical jargon we spew. With Zapier, folks can play with Zaps (recipes) that deliver meaningful API integration that actually solve a problem in their world. They can play with what is possible, without learning all the technical pieces first. They can evolve in their business world, while also progressing on their API journey. IDK. I’m just trying to find ways to help folks better understand what APIs are. I’ll never make everything known to them, but I’m hoping that I can help make folks a little less nervous about the known unknowns, and who knows maybe some day they’ll feel brave enough, and confident in their API awareness that they’ll be able to operate in a world of the unknown unknowns, and settle in on the perpetual journey that are APIs.
This post is from the latest copy of my API Evangelist API Design Industry Guide, which provides a high level look at the API design layer of the industry. Providing a quick look at the services, tools, and some of the common building blocks of API design. The guide is heavily rooted in REST and hypermedia, but is working to track on the expansion of the space beyond just these formats. My industry guides change regularly, and I try to publish the articles from them here on the blog to increase their reach and exposure.
Restlet began as an open source Java API framework over a decade ago and has evolved into an API studio, client, and cloud platform with an API design core. At the center of the API lifecycle management platform is its API designer which gives you a visual view of an API and an OpenAPI or RAML view, providing a machine readable accounting of each API’s contract.
The Restlet Studio allows you to design and document your APIs, starting from scratch, or import existing API design patterns using OpenAPI for RAML. Using the Restlet design UI you can shape the paths, parameters, headers and complete requests and responses for any API. Then, take the definition and actually put it to work in development, staging, or production environments.
Restlet demonstrates how API design is more than just a momentary phase where you are developing APIs and is actively defining every stop along the API lifecycle from design to deprecation. While designing an API in the Restlet API Studio, you can also work to test and automate using the client, helping ensure a usable and complete API is designed. The Restlet Client provides a dashboard to verify the desired API contract in a way that can be shared across teams, with clients, and across stakeholders.
Once the API design process has matured and evolved and is ready for deployment, Restlet empowers production deployment by, generating server and client side code with documentation and a landing page for consumers to access and put an API to work. The Restlet Cloud provides all the components you need to quickly deploy, manage, and scale an API, while using the API design studio as the central place where truth around the API is defined — touching every other aspect of API operations.
Less of a plug for Restlet, I am hoping it is a demonstration of how API design is central to every aspect of API operations and can be central to API service providers. API design isn’t just about the technical design of the surface area of API requests and responses. It is about designing and defining all aspects of doing business using APIs.
We are getting closer to APIStrat in Portland, Oregon, October 31st through November 2nd. So I’m going to keep crafting stories that help convince you should be there. It is the first APIStrat conference as an OpenAPI event, operated by the Linux Foundation events team. Steve and I are still playing a big part, and will be MC’ing, but like OpenAPI, APIStrat has grown to the point where we need to let it become more than just something Steve, myself, and the 3Scale team can execute by ourselves.
APIStrat has always been a place where we gather and talk about OpenAPI, going back to when it was affectionately known as Swagger. Tony, and the team have spoken before, and there has been many other sessions, workshops, and keynotes involving the API specification format. This APIStrat is going to be no different, but there will be an even heavier presence for the specification. Since Tony Tam is stepping away, we are giving a full hour on mainstage for him and folks involved in the evolution of OpenAPI to share their story. Darrel Miller will be holding also be holding a workshop on the first day, where several folks involved in the OAI will be sharing knowledge.
There will also be an OAI booth presence, and I know that Jeff ErnstFriedman will be present for OAI membership discussions. If your company is investing in OpenAPI as part of your API operations, and developing tooling around the specification, you should be considering joining the OAI. Take a look at the current membership list. I’m a member, and so are other heavy hitters like Adobe, Google, Microsoft, IBM, and even my partner in crime 3Scale, and Tyk are present. As a member you get in on the Slack channel conversations, participate on marketing and governance calls, and you get invited to participate on the APIStrat crew (if you want).
Let me know if you are interested becoming a member, I can hook you up with Jeff. He’s the man. If you’d rather, head over to APIStrat registration and get signed up to join in on the conversation in Portland. I can make sure you get some dedicated time with Jeff there, and he can make sure you get hooked up as a member. If you are looking to be part of the conversation that is APIStrat, and help guide the direction OpenAPI is headed, this is where you need to be. So, I’ll see you in Portland next month, and look forward to hearing what you are up to with your API strategy.
You hear a lot about doing APIs at scale in our space. Many folks dismiss web APIs because they feel they won’t scale, and aren’t performing at the scale they envision. The majority of these discussions focus on how do you scale large operations of Twitter, Facebook, or Google scope. A single organization operating API infrastructure at scale, distributed across many geographical regions, supporting millions of users. There are plenty of discussions going on regarding the technology, business, and politics of doing APIs at this scale. I find myself thinking in similar ways, but more federated version of this, where the latest technology might not always be the right answer.
My Human Services Data API (HDSA) work is the best example I have of this. Where I’m having to keep the technology, and API definition bar as low as possible to onboard as many people as I possibly can, but then eventually, be able to aggregate large amounts of data across many federated instance. I have 3,144 counties, and 19,354 cities to consider. They should all be speaking a common schema when it comes to the sharing of human services data. Something that is easier said, than done. When you get on the ground you realize many of them are stuck in 1990s, or early 2000s edition of the web, and just do not have the resources needed to move things forward. They can’t afford the latest SaaS service, and they can’t drop the ball, or thousands, or millions of people will suffer–the stakes are high.
When I go into large companies, who have a large teams, and significant number of resources, the conversation around scale is much different. Sure, there is distributed scale. Sure, there is volume scale. However, most times the distribution and volume exists within a single company or organization. A single command and control structure. However, I’m talking about federated distribution and volume, with no single command and control structure. I’m facing not just how you deliver technology across all these nodes, but how do you train, consider extremely short budgets, and other aspects of ensuring things get done consistently, reliably, without disruption. Cities aren’t the only example of this in our world. I’m also seeing the same across state and federal agencies, as well as k-12, and higher educational institutions. Again, where the technological bar is low, but the stakes are high.
This real world is just a different game than tech culture likes to admit. I can’t always take what I learn in the tech sector and immediately apply on the ground in the mainstream worlds. Some of it applies, but honestly I’ve had to throw a lot of seemingly sensible decision out lately, opting for much simpler, web based solutions, that I’m confident staff can be trained on, and can realistically be implemented in existing IT environments. If we keep moving fast and breaking things, and some point we are going to have to stop, pick up some of the pieces, and help take care of the folks we’ve left behind.
Current API design focusses on using schema to help quantify the payload of the request and response structure of our APIs. JSON Schema, MSON, and other data specifications have emerged to help us quantify the bits we are passing back and forth with APIs. Alongside this evolution, another data format has emerged to help us define simple descriptions of our application-level semantics, similar to how we are using HTML microformats to share data on the web, Application-Level Profile Semantics (ALPS).
ALPS goes well beyond schema, which provides a representation of a plan or theory in the form of an outline or model. ALPS provides a way to define the meaning behind the data, content, and other resources you are making available via an API. ALPS seeks to establish a shared understanding by illuminating the meaning behind hypermedia interfaces (data and state transitions) such as HTML, Collection+JSON, HAL or Siren. It encourages reusability of common profile documents across the media types we are depending on.
Using ALPS you can easily define the common data elements we all use in our API like contacts, todo lists. It can even describe the structure of our APIS for verbose and more useful error responses. What really matters is that you can also define the transitions surrounding these data elements. You can get at the meaning and use behind them, like rolling dice, or playing with a deck of cards. It’s much more than just metadata describing the data elements at work.
If we want our APIs designs to reflect the meaning and interactions around the valuable resources we are serving up, we need to work hard to make sure we are all using common data formats and schema. Schema.org provides us with a good start, but we also need to invest in more in ALPS registries, directories, and dictionaries. These provide machine readable definitions of the common data elements exchanged between systems and applications, as well as the meaning, relationships, interactions, and transitions that make these data elements valuable in our digital worlds.
ALPS has been submitted as an Internet Engineering Task Force (IETF) draft, and provides one possible standard to consider when looking to define the semantics behind API operations. Visit: http://apis.how/alps-io/ for more information.
I had a scare this last weekend regarding my Github infrastructure. My Github organization for API Evangelist was flagged as SPAM and taken down. The Github organization contains almost 100 repositories that I use across my platform. These repositories drive the public side of my research, but also contain YAML files that are used in automation across my entire platform, and network of websites. At about 12:00 PM on Saturday, everything came to a screeching halt, with all the data I depend on to make things go around becoming unavailable.
I have backups of all the data, and the website templates that produce the public side of API Evangelist. I also have a plan B in place for setting up a Jekyll instance that runs on Amazon EC2, but I hadn’t ever actually ran any drills on plan B. After submitting a ticket to Github, I got to work firing up the AWS EC2 instance, and unloading and unpacking the almost 100 website backups for my API Evangelist research. After getting things setup, and as I was preparing to switch over the DNS, I got an email from Github saying:
Sorry for the hassle! It appears your organization had been caught up in a spam filter and was flagged incorrectly. I’ve cleared that flag now, so your account should be back to normal. You shouldn’t see that message again, but let me know if I can help with anything else!
Crisis averted. Luckily this was just my own company Github organization. I operate numerous other API developer portals, code repositories, documentation sites, and other API related projects and tooling that lives entirely on Github. If my personal account was frozen, or any of these organizations taken offline, I would have been in a lot more hot water, and accountable to my clients. Overall I was down for a little over six hours. It showed me the fragile nature of depending on Github, not just for my data driven project public presence, but also it being the center of so many workflows that depend on the YAML, JSON, and code that I publish there.
The experience has forced me to look at my backup process some more, and given me the opportunity to actually run a live drill on failing over to a secondary provider. Even with this failover I still wouldn’t have the Github API available as part of these data project workflows, an API that I depend on, which I really cannot replace. I can replace Git, and Jekyll, but not the Github API portion of the orchestration I depend on so heavily. I have to do some more meditating on this dependency in my world. Don’t get me wrong. I love me some Githubz, but this worries me. Any API dependency worries me if I can’t easily replicate and replace. But, I guess this is what the API game is often about right? Establishing dependencies that are difficult or impossible to walk away from. ;-(
I am spending a lot of time lately thinking about event sourcing, evented architecture, real time, and webhooks. I’m revisiting some of the existing aspects of how we move our bits around the Internet in real time and at scale as part of existing conversation I am having, as well as some projects I’m working on. I recently wrote about making sense of API activity with webhook events, and as I’m crafting a list of meaningful events for my Human Services Data API (HSDA) work, I’m thinking about how these events reflect the value that occurs via API platforms.
As I’m going through the different APIs I’m exposing via a platform, I am working to identify and catalog events in which folks can subscribe to using webhooks. These are the events that occur, like adding a new organization, updating a service, or completing a batch import–all the things people will care about the most. These are the events and activities that occur because their is an API, which have the most value to API consumers, and platform operators. This is what actually matters, and why we are doing an API in the first place, to enable these events to occur. The more these events are triggered, and the more people we have subscribing and engaging with these events, the more value that is generated using an API.
In aggregate, using modern approaches to API management, we might provide analytics and reports that demonstrate all this value being created, to justify the existence of our API. In some implementations, this value created is how we might be charging our API consumers, partners, and other stakeholders. However, in some cases we might even considering paying API consumers when these events occur, incentivizing a certain event-driven behavior that benefits the platform. It is easy to think of API value generation simply as the number of API calls, but I think webhooks has helped establish a new way to look at how value is generated, based upon the number of subscribers to any particular event, or possible a type of of event.
I feel like this is one of the reason we are finally seeing more investment in event sourcing, and evented architecture, and the real time streaming of data and content. The events that matter are getting prioritized, and the technology is advancing to support these events that matter. IDK. As I push forward with my webhook research, and revisit my real time API research, and expand into new realms of messaging and focusing on events, I’m rethinking how we measure and quantify value generation via API platforms. For a long time the measure has been number of API consumers, and the number of API calls, but I feel like things are shifting to the types of events that occur, and how meaningful these events are to API providers, consumers, and their application end-users.
I’m encountering more API providers who have performance and scalability concerns with their APIs, who are making technical procurement decisions (gateways, proxies, etc) based upon these challenges, but have not invested any time or energy into planning and optimization of caching for their existing web servers that are delivering their APIs. Caching is another aspect of HTTP that I keep finding folks have little or no awareness of, and do not consider more investment in it to assist them in alleviating their scalability and performance concerns.
There was a meeting I attended a couple weeks back where an API implementation was concerned about a new project for bulk loading and syncing of data between multiple external systems and their own, because of the strain it put on their database. Citing that they received millions of website, and API calls daily, they said they could not take the added load on their already strained systems during the day, limiting this type of activity to a narrow window at night. I began inquiring regarding caching practices in place on web, and API traffic, and they acknowledged that they new of no such activity or practices in place. This isn’t uncommon in my experiences, and I regularly encounter IT groups who just don’t have the time and HTTP awareness to implement any coherent strategy–this particular one just happened to admit it.
My friends over at the API Academy have a great post on caching for RESTful and Hypermedia APIs, so I won’t be addressing the details of HTTP, and how you can optimize your APIs in this way. API caching isn’t an unproven technology, and it is a well known aspect of operating on the web, but it does take some investment and awareness. Like API design in general, you have to get to know the resources you are serving up, understand how your consumers are putting these resources to work, and adjust, dial-in, and tweak your caching strategy. It is something that gets incrementally harder, the more time zones you operate in, but with some investment you can significantly increase the scalability of your APIs, the performance of properly cached paths, and do more with less resources. Scaling the size of your server isn’t always the first sensible thing you should be doing, a coherent caching strategy will be a much wiser and cost-effective approach in the long run.
A lack of API caching strategy amongst my clients and readers has a damaging effect on API operations. However, I’d say the most damage done isn’t by the lack of a strategy, it is the reverberating decisions made around the inability to properly scale, and deliver the performance API clients are needing. I see many technology procurement decisions being made where scalability and performance are a major part of the conversation and decision making process. Where conversations around API caching have never occurred. This is just lazy. This is just ignoring one of the key tenets of what makes the web work. This is just investing in technical debt, over making sensible architectural decisions, and spending the time to get to know the resources you are serving up, and how your customers are using them. Learning about HTTP, and caching does take some investment and planning, but it is nowhere the investment and planning that will be required to unwind the technical debt you’ve acquired made from the other bad technology purchasing decisions you’ve made along the way.
Arnaud Lauret (@arno_di_loreto), the API Handyman (@apihandyman), has been developing an API Stylebook that provides a collection of resources for API designers. It is a brilliant aggregation of thirteen API design guides from Atlassian, Cisco, Cloud Foundry, Google, Haufe, Heroku, Microsoft, PayPal, Red Hat, The White House, and Zalando. It highlights best practices used by leading API providers.
“The API Stylebook aims to help API Designers to solve API design matters and build their API design guidelines by providing quick and easy access to selected and categorized resources”, says Lauret. A unique community resource, it provides deep linking to specific topics within publicly available API design guidelines. Instead of reinventing the wheel or searching Google for hours, API designers quickly can find solutions and inspiration from these existing best practices.
More than just a list of guidelines, it is a machine readable distillation of the thirteen API design guides into a master list of API design topics you can consider when crafting your own API design guide. It is slick. I like Arnaud’s approach to analyzing the existing API design patterns across the API platforms who have shared their guides. I also really like the YAML approach and it’s presented as a very good looking website using Github, and Github Pages.
This is how API literacy tools should be constructed and it provides a valuable lesson in API design. You can take that lesson and execute what you’ve learned along the way, with a very hands-on process. Using the API Stylebook, you can craft an API design guide for your team to follow and employ across API operations. Anyone can fork the API Stylebook, pick and choose the best practices across the thirteen API design guides, and then publish a version as a Github repository. The resulting repo can easily be included in your API docs, as a standalone website.
There are two things going on here. First, API providers are sharing their views on API design best practices — important stuff. Second, an aggregator (API Handyman) makes these best practices machine-readable, forkable, and reusable. This knowledge layer of API operations becomes even more valuable when it is openly accessible and shareable in this way. We need our APIs to employ common patterns and speak in common formats so that they’ll work together and reduce friction in the industries where they being put to work. API Stylebook approaches API design through continuous integration, allowing API development groups to define API methodologies that can be deeply integrated into our lifecycle.
The more API design guides that are sourced in the API Stylebook, the better the available topics will become. Arnaud will be adding new API design guide to the stack. It is better for the community if you start with his existing list of topics and add your own YAML definitions that drive the API Stylebook on Github, but we’ll take what we can get. It is a process that every company, organization, institution and government agency should go through and what you learn along the way is essential for any API team.
API design isn’t just knowing the details of REST or any single architecture pattern. It’s the knowledge you gain from the definition process, and a huge part of this process is having an effective pool of community knowledge to pull from. Arnaud has done an amazing job at processing disparate and unstructured API design and yielding coherent list of topics that we can all consider in our own API design process. We should all be investing in (and building on) his work.It is something I’m baking into my API design research wherever I can — dovetailing the common building blocks I’ve aggregate across his very detailed work.
REST is a philosphy, not a standard. The current wave of web API success we see across the technology landscape has been about API providers building on top of the best practices of the web API providers that came before them, going back to the early pioneers of SalesForce, eBay, and Amazon. Every API architect I know has learned from reverse enginnering and studying the practices of what they’d consider to be well designed APIs, and understanding the bad design practices out there. If knowledge isn’t shared, then the next generation of API developers do not learn from what came before them. Making the practice of publishing your API design guide for your API not just good for your own API operations, but good for the entire API community.
The most significant API movements in the last five years came from Apiary in my opinion, but the most significant API movements in the next five years will be about sharing API design patterns on Github. Allowing API providers to borrow from each other, resuse, and communicate around common API patterns that are working, or not working. API definitions are allowing us to publish these common design patterns to Github, and the next generation of continous integration and deployment tooling like API Stylebook allow us to publish, share, and even test the quality of our APIs.
I will be continuing to invest in API Stylebook, writing stories about it, and helping API providers understand the value of the open source project and sharing of their own API design patterns, and since it’s machine readable, I will be integrating the API Stylebook into all of my API design research.
I’m slowly learning more about Kafka, and the other messaging and data streaming solutions gaining traction in the API space. If you aren’t on the Kafka train yet, “Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.” I’m still learning about how Kafka works, and with no real production experience, it is something that is taking time.
As part of my conversations on the subject, I was introduced to Confluent, a platform version of Kafka, which is the quickest way I have seen to get started with real-time data streams. As part of the Confluent offering I noticed they have a REST proxy, which you can find the API documentation here, and the code for the Kafka REST proxy on Github. According to the Github repo, “the Kafka REST Proxy provides a RESTful interface to a Kafka cluster. It makes it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients.”
I’ve noticed that many of the other messaging and data streaming solutions out of Apache these days have diverted from using REST, which makes sense for speed, and scale, but when it comes to reaching a wider audience I can still see the need to have RESTful API. Delivering a kind of multi-speed solution that allows developers to pick their speed based upon their skills, awareness, and need. I’m feeling like the platform approach of Confluent, combined with a RESTful layer, will give them an advantage over other Kafa service providers, or just deploying the open source solution out of the box.
REST isn’t always the most efficient, or scalable solution, but when it comes to reaching a wide audience of developers, and allowing consumers to get up and running quickly, REST is still a sensible approach. Honestly, I don’t think it is just REST, it is also about leveraging the web. Not that everyone understand the web, but I think it is what a large number of developers have been exposed to, and have been building on in the last decade. I can see high volume API solutions in the future often having a native protocol and client, but also supporting REST, and gRPC to make their solutions more accessible, performant, scalable, and quickly adopted and integrated alongside existing infrastructure.
Amazon Web Services recently updated their billing for EC2 instances to be by the second, which I really like, because I’ll fire up an instance and run for minutes, then shut things down. I’m just looking to process patent downloads, or other intensive workload projects. Beyond just EC2, the rest of Amazon’s platform is still very usage based. Meaning, you get charged for whatever you use, with unit pricing for each resource designed to compliment how it gets put to use. You get charged for the hard costs of compute, storage, and bandwidth, but you also see per message, job, entry, and other types of billing depending on the type of resource being delivered via API.
With this model for doing APIs, I’m wondering why so many API providers still have access plans and tiers. I’ve vented several times that I think service tiers are a legacy of a SaaS way of thinking and does not scale for API consumers. Maybe back when we used a handful of APIs, but the number of APIs I’m using is pushing 50 these days, and I can’t alway justify a monthly subscription to get what I need. I’m looking to just get access to valuable API resources, and get billed for whatever I use. If I don’t use anything for 6 months, I don’t get billed for anything. Also, I want to be able to run large jobs which consume intense amounts of resources without hitting tier and other limits–just charge me for what I use. If I have a $1,000.00 to spend today, let me spend it. Don’t make me jump through hoops.
I know the answer to my question regarding why so many API startups do this. It is because the resources being provided via the API isn’t the product, us API consumers are. They are looking to ensure a certain level of headcount, monthly, and annual subscribers, so that they can sell us to their investors, and ultimately whoever purchases us like cattle in the end. I’m sure there are other reasons for having pricing tiers, but this is still the primary reason we see SaaS based pricing continue to be so pervasive in an API driven world. For me, if an API provider has tiered pricing, I’ll almost always go look elsewhere. I just can’t manage 40 or 50 subscriptions, and if the number of APIs keep growing, I can’t handle 100-200 subscriptions. I just need to pay for the resources my business needs, and nothing more. I sure don’t have time to be a product in your startup cattle auction.
Pay as you go, usage based pricing is one way the cloud giants will suffocate out the small startups in the future. While startups are trying to court investors, and their acquirers, the cloud giants will just offer a competing service, drop the price to run competitors off, then once the coast is clear, raise prices back up to an profitable state. To compete, more API providers will have to go with a utility based pricing strategy, while also offering wholesale versions of their APIs in the marketplaces for AWS, Azure, and Google. I can’t help think that things might have been different if the scales didn’t tip so hard towards all of us API consumers being the product, and API providers focused more on running businesses, and catering to our real world business needs. Oh well, we got the world we have, I’ll just keep plowing forward.
When I started API Evangelist in 2010, API usage in mobile phones was the biggest factor contributing to me quitting my job, and becoming a independent voice for all APIs. I was being asked to deliver APIs to drive mobile applications on the iPhone, and while helping run technology for Google I/O I saw an increased need for resources to be delivered to this emerging platform. I knew that APIs were going to play an essential role in ensuring data, content, and algorithms could be put to use in mobile applications.
Even with the importance of mobile, it wasn’t the only reason I knew that APIs were going to be important, which is something that still resonates today. In 2007, I saw the growing importance of social media APIs, and how messaging, images, and video were being made more distributed using APIs. Then in 2008, I saw that I could deliver global infrastructure using web APIs, demonstrating that web APIs weren’t a toy, and that you could operate a real business using web APIs. Then the whole mobile thing was just the tipping point, which demonstrated that the web was maturing beyond just websites, and it would be how we’d be doing business for some time to come.
Every day I see people with blinders on focusing in on one slice of the API pie, seeing APIs as purely about commerce, social, cloud, mobile, IoT, messaging, or other growing aspect of the API economy. People are good at seeing things through the lens of their products, services, and industry. It is easy for them to ignore those people over there, or the other aspects of why leveraging the web is so important to all of this working. They get excited about a new open source solution, protocol, or pattern, and focus in exclusively on a single aspect of how we deliver technology–sometimes at the cost of other areas of their operations, or the future. If mobile is your world, and you are in the business of building top notch mobile apps, then this is the world you see.
I am in the business of web APIs. Understanding the technology, business, and politics of delivering data, content, and algorithms by leveraging the web. I’m not in the business of commerce, social, mobile, or IoT. I’m also not betting on bots, voice enablement, serverless, or the blockchain. I’m in the business of understand how the web can make all of these things work, or not work. I’m always fascinated how passionate folks get about a specific approach, and even aggressive about how they communicate about why it is better than everything else. It is also interesting how they are so willing to ignore the negative consequences, as part of their passionate belief system. You see this playing out at Facebook right now on a pretty large scale–a whole lot of unintended consequences from many good folks believing delusionally in the power of technology, and it a significant amount of pushback before they’ll change their tune.
In my seven years as the API Evangelist I’ve been tempted to keep focused on just government, or maybe just open data, or possibly social good APIs, but I know the dangers of doing this. I’m happy to support folks who are down in their silos, and don’t give them grief for not seeing the big picture. However, when folks shut me down, question my agenda, and are 100% confident they have the answers, I have to step back, and let them continue on their journey alone. There is a certain privilege that comes along with living in technological silos, and I feel like some people I know who are doing APIs in the service of mobile have their blinders on right now and are ignoring their web roots, as well as the web future. Also, they are missing out on opportunities for learning around how APIs being used on devices, the network, as part of bots and automation, voice and conversational interfaces. Remember, that APIs aren’t just for mobile, and you should be ready and open to use APIs for anything that might come along, and not ignoring the bigger picture.
I know that I make some tech companies nervous. They see me as being unpredictable, with no guarantees regarding what I will say, in a world where the message should be tightly controlled. I feel it in the silence from many of the folks that are paying attention to me at large companies, and I’ve heard it specifically from some of my friends who aren’t concerned with telling me personally. These concerns keep them from working with me on storytelling projects, and prevent them from telling me stories about what is happening internally behind their firewall. It often doesn’t stop employees from telling me things off the record, but it does hinder official relationships, and on the record stories from being shared.
I just want folks to know that I’m not in the scoop, or gotcha business. I only check-in on my page views monthly to help articulate where things are with my sponsors. I’m more than happy to keep conversations off the record, anonymize sources and topics. Even the folks in the space who have pissed me off do not get directly called out by me. Well, most of them. I’ve gone after Oracle a couple of times, but they are the worst of the worst. There are other startups and bigcos who I do not like, and you don’t ever hear me talking trash about them on my blog. Most of my rants are anonymized, generalized, and I take extra care to ensure no enterprise egos, careers, or brands are hurt in the making of API Evangelist.
If you study my work, you’ll see that I talk regularly with federal government agencies, and large enterprise organizations weekly, and I never disclose things I shouldn’t be. If you find me unpredictable, I’m guessing you really haven’t been tuning into what I’ve been doing for very long, or your insecurities run deeper than anything to do with me. I’m not in the business of making folks look bad. Most of the companies who are looking bad in the API space do not need my help, they excel at doing it on their own. I’m usually just chiming in to help amplify, and use as a case study for what API providers should consider NOT DOING in their own API operations. Sure, I may call you out for your dumb patents, and the harmful acquisitions you make, but anything I rant about is going to already be public material–I NEVER do this with private conversations.
So, if you are experiencing reservations about sharing stories with me, or possibly sponsoring some storytelling on API Evangelist because you are worried about what will happen, stop fretting. If you are upfront with me, clear about what is on the record, and what is off, and honest about what you are looking to get out of the relationship, things will be fine. Even if they end up being rocky, I’m not the kind of person to call you out on the blog. I may complain, rant, and vent, but you can look through seven years of the blog and you won’t find me doing that about anyone I’ve specifically worked with on storytelling projects. I don’t always agree with why corporations, institutions, and government agencies are so controlling of the message around their API operations, but I will be respectful of any line you draw for me.
I’ve been struggling to get the latest edition of my industry guides out the door. I have a new Adobe Indesign format which I really like as a constraint, but is also pushing my desktop publishing skills. What is really kicking my ass though, is the editing. This latest copy was professionally edited, but I ran out of money to pay him on future guides, and I ended up making some slight changes to this one as well. I am very self-conscious of my grammar and spelling mistakes. I’m capable of editing my own stuff, and my grammar and spelling is high quality. The problem is that I’m too close to the content, and with each edit I make changes, which then introduce new mistakes. Also my brain moves too fast sometimes, and I just make silly mistakes, and overlook things by just reading it the way my brain intended.
Anyways, I’m over stressing on it all. I just want to get my guides out. I have too much of a back log, and since I can afford a professional editor to shadow my work, I’m just going to put them out there. If you find mistakes, feel free to submit a Github issue on the repo for my API design research. I have too many guides to get out, and it is more important to me that my research moves forward, I spend the time distilling things down into a guide, and hitting publish. I can’t wait for perfect. If folks discount my work because I’m moving so fast, too bad. It is more important that the knowledge is in my head. If you want to help fund me so I can properly afford an editor, I welcome that as well–I have one who will work with me full time, I just need the cash! Anyways, I’m finally getting around to publishing this edition of the API design industry guide, which I hope provides a snapshot of the space.
My API Evangelist API Design Industry Guide is not meant for the API echo chamber. It is meant for executives, business folks, IT, and developers who are looking to do APIs outside of the mainstream tech community. My goal isn’t to cover in detail every aspect of API design. My goal is to cover the industry of API design, while focusing on the highlights of each working area. I track on the service providers who deliver solutions in the API design space, as well as some of the open source tooling that is available. Then I try to look at some of the common building blocks of APIs design, with an emphasis on REST and hypermedia. I also inject a handful of one and two page articles in there covering a variety of topics, as well as how the world of API design is shifting with the introduction of new approaches like gRPC and GraphQL. My definition of API design is not dogmatically REST. It is about pragmatically stepping back from API development and thinking about the best patterns available to us in the space.
Once I get all of my core research area published in this new format, I will work to update them more regularly, and try to keep them rolling forward with new versions. If you would like to sponsor one, or invest in one of the other 85+ areas of my API industry research, feel free to reach out. Thanks for your patience while I found the mojo to work on these again, and your help in identifying any errors or mistakes I’ve made. Also, take notice my new approach to making these available, where you can always download them for free, or you can purchase the latest copy using Gumroad for a small fee. This helps support my work, and you’ll be added to the mailing list, and automatically get a free copy when I update in the future. I’m moving forward to work on my API deployment, and management guides, as well as some sponsored guides in the areas of data, database, and the trend of fake news, accounts, bots, and more.
Thanks again for your support!
I’m a member of the OpenAPI Iniative (OAI). I’m not very active on the governance or marketing, but I enjoy hanging out in the hallways of the Slack channel, and being part of the conversation. I’m pretty confident in the core group’s ability to steer the direction of the specification, and leave my influence to be more about storytelling externally, and planting seeds in the minds of folks who are putting the API specification to use. I have a much different style to influencing the API space than many of the companies I share membership within the OAI–it is just my way.
I am working with more groups to help them craft, maintain, and evangelize around a specific OpenAPI definition, for use in a specific industry. THe primary one on the table for me is the Human Services Data API (HSDA). Which is an OpenAPI for helping cities, municipalities, and non-profit organizations that help deliver information around human services, speak a common language. This is just one example of industry specific API definitions emerging. I am seeing OpenAPI emerge for PSD2, FHIR, helping guide the conversation going on in the financial and healthcare sectors.
The OpenAPI as a top level API specification standard is maturing, and is something that reached version 3.0, and once the services and tooling catch up, we’ll see another boom in industry specific API definitions emerge. This is when we are going to see the need to start harmonizing, standardizing, and merging many disparate standards into a single specification, or at least interoperable specifications. You see this happening right now with OpenAPI, API Blueprint, and RAML–they are all part of the OpenAPI Initiative (OAI). In the next five years you will see this same thing begin occurring for other industry specific APIs, and we’ll eventually need governing bodies to help move forward these independent efforts, as well as feed needs back up the supply chain to OpenAPI.
Maybe not right away, but eventually the OpenAPI will need to start thinking about how it establishes separate industry working groups dedicated to specific implementations of OpenAPI. I can see a healthcare, banking, education, transportation, messaging, and other specific industries emerge, needing a more stabilized spec, and formal group to help drive forward incarnations of the spec. It’s taking me awhile to get the HSDA working group to be OpenAPI literate, but I’m already seeing the speed picking up, as different members learn to contribute to the HSDA OpenAPI, and understanding it is a central truth for code, documentation, testing, and for everything we are doing as part of the working group.
Just some food for thought. Like I said, we are a long way off from needing this, but I can already see it on the horizon. I’d just like to plant the seed with the OAI, as well as with folks out there who are pushing forward a specific OpenAPI within an industry. As you look to formalize what you are doing you might want to join the OAI, and participate in some of the conversation going on at the higher level. Maybe come to APIStrat in Portland this November, and bring your OPenAPI discussions with you. APIStrat has long been the place where we hammer out API definition conversations, so it make sense to keep going, especially now that it is an official OpenAPI (OAI) conference. If you have any questions about what I’m doing with HSDA, and the future of industry specific API definitions, feel free to reach out.
I was taking a fresh look at my real time API research as part of some data streaming, and event sourcing conversations I was having last week. My research areas are never perfect, but I’d say that real time is still the best umbrella to think about some of the shifts we are seeing on the landscape recently. They are nothing new, but there has been renewed energy, new and interesting conversation going on, as well as some growing trends that I cannot ignore. To support my research, I took a day this week to dive in, have a conversation with my buddy Alex over at the TheNewStack.io, and the new CEO of WSO2 Tyler Jewell around what is happening.
The way I approach my research is to always step back and look at what is happening already in the space, and I wanted to take another look at some of the real time API service providers I was already keeping eye on in the space:
- Pubnub - APIs for developers building secure realtime Mobile, Web, and IoT Apps.
- StreamData - Transform any API into a real-time data stream without a single line of server code.
- Fanout.io - Fanout’s reverse proxy helps you push data to connected devices instantly.
- Firebase - Store and sync data with our NoSQL cloud database. Data is synced across all clients in real time, and remains available when your app goes offline.
- Pusher - Leaders in real time technologies. We empower all developers to create live features for web and mobile apps with our simple hosted API.
I’ve been tracking on what these providers have been doing for a while. They’ve all been pushing to boundaries of what is streaming, and real time APIs for some time. Another open source solution that I think is worth noting, which I believe some of the above services have leverages is Netty.io.
- Netty - Netty is an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients.
I also wanted to make sure and include Google’s approach to a technology that has been around a while:
- Google Cloud Pub/Sub - Google Cloud Pub/Sub is a fully-managed real-time messaging service that allows you to send and receive messages between independent applications.
Next, I wanted to refresh my understanding of all the Apache projects that speak to this realm. I’m always trying to keep a handle on what they each actually offer, and how they overlap. So, seeing them side by side like this helps me think about how they fit into the big picture.
- Apache Kafka - Kafka™ is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.
- Apache Flink - Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications.
- Apache Spark - Spark Streaming makes it easy to build scalable fault-tolerant streaming applications. Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams.
- Apache Storm Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.
- Apache Apollo - ActiveMQ Apollo is a faster, more reliable, easier to maintain messaging broker built from the foundations of the original ActiveMQ.
One thing I think is worth noting with all of these is the absence of the web when you read through their APIs. Apollo had some significant RESTful approaches, and you find gateways and plugins for some of the others, but when you consider how these technologies fit into the wider API picture, I’d say they aren’t about embracing the web.
On that note, I think it is worth mentioning what is going on over at Google, with their gRPC effort, which provides “bi-directional streaming and fully integrated pluggable authentication with http/2 based transport”:
- gRPC - A high performance, open-source universal RPC framework
Also, I think most notably, they are continuing the tradition of APIs embracing the web, and built on top of HTTP/2. For me, this is always important, and trumps just being open source in my book. The more web an open source technology, and a company’s service utilize, the more comfortable I’m going to feel telling my readers they should be baking this into their operations.
After these services and tooling, I don’t want to forget about the good ol fashioned protocols available out there, that help use doing things in real time. I’m tracking on 12 real time protocols that I see in use across the companies, organizations, institutions, and government agencies I’m tracking on:
- Simple (or Streaming) Text Orientated Messaging Protocol (STOMP) - STOMP is the Simple (or Streaming) Text Orientated Messaging Protocol. STOMP provides an interoperable wire format so that STOMP clients can communicate with any STOMP message broker to provide easy and widespread messaging interoperability among many languages, platforms and brokers.
- Advanced Message Queuing Protocol (AMQP) - The Advanced Message Queuing Protocol (AMQP) is an open standard for passing business messages between applications or organizations. It connects systems, feeds business processes with the information they need and reliably transmits onward the instructions that achieve their goals.
- MQTT - MQTT is a machine-to-machine (M2M)/Internet of Things connectivity protocol. It was designed as an extremely lightweight publish/subscribe messaging transport. It is useful for connections with remote locations where a small code footprint is required and/or network bandwidth is at a premium.
- OpenWire - OpenWire is our cross language Wire Protocol to allow native access to ActiveMQ from a number of different languages and platforms. The Java OpenWire transport is the default transport in ActiveMQ 4.x or later.
- Websockets - WebSocket is a protocol providing full-duplex communication channels over a single TCP connection. The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011, and the WebSocket API in Web IDL is being standardized by the W3C.
- Extensible Messaging and Presence Protocol (XMPP) - XMPP is the Extensible Messaging and Presence Protocol, a set of open technologies for instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.
- PubSubHubbub - PubSubHubbub is an open protocol for distributed publish/subscribe communication on the Internet. Initially designed to extend the Atom (and RSS) protocols for data feeds, the protocol can be applied to any data type (e.g. HTML, text, pictures, audio, video) as long as it is accessible via HTTP. Its main purpose is to provide real-time notifications of changes, which improves upon the typical situation where a client periodically polls the feed server at some arbitrary interval. In this way, PubSubHubbub provides pushed HTTP notifications without requiring clients to spend resources on polling for changes.
- Real Time Streaming Protocol (RTSP) - The Real Time Streaming Protocol (RTSP) is a network control protocol designed for use in entertainment and communications systems to control streaming media servers. The protocol is used for establishing and controlling media sessions between end points. Clients of media servers issue VCR-style commands, such as play and pause, to facilitate real-time control of playback of media files from the server.
- Server-Sent Events - Server-sent events (SSE) is a technology where a browser receives automatic updates from a server via HTTP connection. The Server-Sent Events EventSource API is standardized as part of HTML5 by the W3C.
- HTTP Live Streaming (HLS) - HTTP Live Streaming (also known as HLS) is an HTTP-based media streaming communications protocol implemented by Apple Inc. as part of its QuickTime, Safari, OS X, and iOS software.
- HTTP Long Polling - HTTP long polling, where the client polls the server requesting new information. The server holds the request open until new data is available. Once available, the server responds and sends the new information. When the client receives the new information, it immediately sends another request, and the operation is repeated. This effectively emulates a server push feature.
These protocols are used by the majority of the service providers and tooling I list above, but in my research I’m always trying to focus on not just the services and tooling, but the actual open standards that they support.
I have to also mention the entry level aspect of real time in my opinion. Something, that many API providers support, but also is the 101 level approach that some companies, organizations, institutions, and agencies need to be exposed to before they get overwhelmed with other approaches.
- Webhooks - A webhook in web development is a method of augmenting or altering the behavior of a web page, or web application, with custom callbacks. These callbacks may be maintained, modified, and managed by third-party users and developers who may not necessarily be affiliated with the originating website or application.
That is the real time API landscape. Sure, there are other services, and tooling, but this is the cream on top. I’m also struggling with the overlap with event sourcing, evented architecture, messaging, and other layers of the API space that are being used to move bits and bytes around today. Technologists aren’t always the best at using precise words, or keeping things simple, and easy to understand, let alone articulate. This is one of the concerns I have with streaming API approaches, is that they are often over the heads, and beyond the needs of some API providers, and may API consumers. They have their place within certain use cases, and large organizations that have the resources, but I spend a lot of time worrying about the little guy.
I think a good example of web API vs streaming API can be found in the Twitter API community. Many folks just need simple, intuitive, RESTful endpoints to get access to data, and content. While a much smaller slice of the pie will have the technology, skills, and compute capacity to do things at scale. Regardless, I see technologies like Apache Kafka being turned into plug and play, infrastructure as a service approaches, allowing anyone to quickly deploy to Heroku, and just put to work via a SaaS model. So, of course, I will still be paying attention, and trying to make sense out of all of this. I don’t know where any one it will be going, but I will keep tuning in, and telling stories about how real time, and streaming API technology is being used, or not being used.
I have been having more conversations with federal agencies as part of my work with my Skylight partners about API related microconsulting. One recent conversation, which I won’t mention the agency, because I haven’t gotten approval, involved bug bounties on top of an API they are rolling out. The agency isn’t looking for the regular technology procurement lifecycle around this project, they are just looking for a little bit of research and consulting to help ensure they are on the right track when it comes to hardening their API approach.
Micro consulting like this will usually not exceed $5,000.00 USD, and will always be a short term commitment. From my vantage point micro consulting will always be API related, and in this particular case involves studying how other API providers in the private sector are leveraging bug bounties to help harden their APIs either before they go public, or afterwards in an ongoing fashion. After I do the research I will be taking this work back to my team of consultants at Skylight, and we’ll put together formal report and presentation that we will bring back to the federal government agency to put into motion.
This approach to doing APIs in the federal government (or any government) is a win-win. It fits with my approach to doing research at API Evangelist, and it provides API expertise for federal agencies in small, affordable, and bite-size chunks. Government agencies do not have to wait months, or years, and spend massive amounts of money to gain access to API expertise. For Skylight, it gets our foot in the door within government, and helps demonstrate the expertise we bring to the table. Something that will almost always turn into additional micro procurement relationships, as well as potentially larger scale, ongoing project relationships.
Personally, I like my API consulting just like my APIs, small, and doing one thing well. I don’t like consulting contracts that try to do too much. I also like getting paid in chunks that can usually be put on the corporate credit card, and avoid too much purchase order, vendor system, 30, 60, and 90 day wrangling–or worse, not getting paid at all. You’ll hear me beating the micro consulting, and micro procurement drum a lot more in coming months. I’m going to be working to educate more government agencies, and even the enterprise of the potential when it comes to API related content creation, storytelling, training, and research. I’m predicting it will have the same effect as APIs are having on how companies, organizations, institutions, and government agencies are doing business in the digital economy.
I’m regularly working to make APIs more accessible to non-developers, and Zapier is the #1 way I do this. Zapier provides ready-to-go API integration recipes for over 750 APIs, providing IFTTT-like functionality, but in a way that actual pays the whole API thing forward (Zapier has APIs, IFTTT does not). One of the benefits of having APIs is you can build embeddable tooling on top of them, and Zapier has some basic embeddable tools available to anyone, with some more advanced options for partners via their partner API.
Using the Zapier basic embeddable widget you can list one or many Zaps, providing recipes for any user to integrate with one or many APIs, that can be embedded into a web page, or within an application:
All you do is add the id for each of the Zaps you wish to list, under the “guided_zaps” variable, and it will list the icon, title, and “use this zap” functionality, all wrapped in the appropriate powered by Zapier branding. I’m developing lists of useful Zaps ranging from working with Google Sheets, to managing social media presence on Twitter and Facebook. Everyday, useful things that the average user might find valuable when it comes to automating, and taking control over their online presence. Anytime I reference possible API integration use cases in a story, I’m going to start embedding a widget of actual Zaps you can use to accomplish whatever I’m talking about below.
I know that catering to the enterprise is where the money is at. I know that playing with all the cool new containerized, event sourcing, continuously integrated and deployed solutions are where you can prove you know your stuff. However, in my world I come across so many companies, organizations, and government agencies that just need things to work. They don’t have the skills, resources, or time to play with everything cool, and really could just use some better access to their data and content across their business, with trusted partners, and maybe solicit the help of 3rd party developers to help carry the load.
Many of the conversation I am having within startup and tech circles often focus on scale, and the latest tech. I get that this is the way things work in alpha tech circles, and this is applicable in your worlds, always moving forward, pushing the scope of what we are doing, and making sure we are playing with the latest tools is how it’s done. However, not everyone has this luxury, and many companies can’t afford to hire the talent needed, or pay the cost associated with doing things the most modern approach, or even the right way. Remember, when you are talking about Kafka, Kuburnetes, Docker, GraphQL, and other leading edge solutions, you are talking from a place of privilege. Meaning you probably have the time, resources, and space to implement the most modern approach, and have the team to do it right.
I’m not trying to stop you from having fun, and doing what you do. I am just trying to share what I’m seeing on the ground at companies, organizations, and government agencies I’m talking to. I’m spending a lot of time trying to help get folks up to speed on everything I’m seeing, and many of them are intimidated by the pace at which things move, the scope of implementations they are reading about across tech blogs, and insecure about what they don’t know. I’m finding that I’m helping folks think through their basic usage of the cloud, over containerization, and helping them understand basic APIs and webhooks, over evented architecture and modern approaches to messaging. They just aren’t ready for much of what I’m reading and tracking on in my monitoring of the API space.
I guess I’m just asking for some help. If you are writing about tech, maybe reach out to small businesses and organizations outside the tech bubble and ask them what their challenges are, over writing about yet another startup, or cool new tech. Maybe take a portion of that investment you just got and establish some grants for nonprofits, students, or others to spend some time learning about new technology (not yours), and pay a contractor to help them solve a single, small problem–enjoying a little micro procurement assist, on your dime. Maybe rather than disrupting an industry, you could reach out to some smaller companies in the space and mentor them? IDK. I’m just spitballin here. I just feel like the gap is widening when it comes to technology on the ground in your average city, over what I see in the Bay Area, and other major tech hubs. If we don’t work to close this gap, it is going to bite us all in the ass at some point, if it already isn’t.
A regularly question I get from business folks out in the space, is regarding where they should start with APIs. My world is usually broke into two areas: 1) Providing APIs, and 2) Consuming APIs. I’d say that these business folk I keep coming across could easily span both of these areas, making it significantly more complicated to help them understand where they should be getting started with APIs. With the API landscape being so wide, and APIs becoming so ubiquitous across many industries, helping someone onboard to the concept can get pretty complex and confusing pretty quick.
I always try to prime the pump with my API 101 material, and encourage folks to learn about the history of APIs. I find that before you begin getting bombarded with the technical details of APIs it helps to get the lay of the land, and understand what is going on at the highest level, developing a better understanding of how we got here. Before you get working with any single API, you should try to understand why it has begun to be such a big part of everything we know of online today, and via our mobile phones. Most people don’t realize that they are using APIs everyday, as part of their regular business activity, and common things they do in their personal lives–things like buy products from Amazon, and sharing updates with friends on Facebook.
Next, I recommend looking at the software you use each day in your work. If you are an architect, look a the CAD software you are using. If you are in healthcare, look at the administrative systems you use, and the devices you put to work. If you are retail, look at the point of sale (POS), and payment systems you use. All of these companies have APIs in one form or another. They might not all be public APIs like Facebook, Twitter, and Google, but they have APIs that might make a lot more sense to you in your world. You should be learning about the companies behind this software, searching their websites, knowledge-bases, and other systems for APIs, and understanding what they do, and how you can possibly put them to use in your world, and help you do what you do better.
Take a look at your smart phone. All those icons for the applications you use have APIs behind them. Mobile phones are why APIs have become such a big thing. It is how information is sent and received by mobile applications. If you want to learn about APIs, you should start by learning in the context of the applications, services, and tooling you are already depending on and putting to use. A great example of this for business users can be found with the spreadsheet. Both Google and Microsoft have APIs, allowing you to get data into, and out of a spreadsheet. I often feel like the spreadsheet API is one of the most important, and underrated APIs out there, and spreadsheets are one of the most important API clients out there as well, but often gets overlooked by developers who don’t love the spreadsheet as much as business users.
As a professional in any industry, I recommend starting here. The chances you will discover and learn about APIs in a more meaningful and impactful way is greater, than if you just take some learn to code, or other technical approach. I’m going to begin building this into my API lessons and training work I’ve kicked off recently, and see if I can develop more material that is less API provider vs API consumer, and more about helping folks in specific industries, or who might be using specific software, platforms, services, and tools. Regardless, if you are looking to jumpstart your own API learning I recommend just looking around you, I think you’d be surprised how many APIs there are behind the solutions you are already putting to use.
I was talking to my friends TC2027 Computer and Information Security class at Tec de Monterrey via a Google hangout today, and one of the questions I got was around managing API sessions using JWT, which was spawned from a story about security JWT. A student was curious about managing session across API consumption, while addressing securing concerns, making sure tokens aren’t abused, and there isn’t API consumption from 3rd parties who shouldn’t have access going unnoticed.
I feel like there are two important, and often competing interests occurring here. We want to secure our API resources, making sure data isn’t leaked, and prevent breaches. We want to make sure we know who is accessing resources, and develop a heightened awareness regarding who is accessing what, and how they are putting them to use. However, the more we march down the road of managing session, logging, analyzing, tracking, and securing our APIs, we are also simultaneously ramping up the surveillance of our platforms, and the web, mobile, network, and device clients who are putting our resources to use. Sure, we want to secure things, but we also want to think about the opportunity for abuse, as we are working to manage abuse on our platforms.
To answer the question around how to track sessions across API operations I recommended thinking about that identification layer, which includes JWT and OAuth, depending on the situation. After that you should be looking other dimensions for identifying session like IP address, timestamps, user agent, and any other identifying characteristics. An app or user token is much more about identification, than it ever provides actual security, and to truly identify a valid session you should have more than one dimension beyond that key to acknowledge valid sessions, as well as just session in general. Identifying what healthy sessions look like, as well as unhealthy, or unique sessions that might be out of the realm of normal operations.
To accomplish all of this, I recommend implementing a modern API management solution, but also pulling in logging from all other layers including DNS, web server, database, and any other system in the stack. To be able to truly identify healthy and unhealthy sessions you need visibility, and synchronicity across all logging layers of the API stack. Does the API management logs reflect DNS, and web server, etc. This is where access tiers, rate limits, and overall consumption awareness really comes in, and having the right tools to lock things down, freeze keys and tokens, as well as being able to identify what healthy API consumption looks like, providing a blueprint for what API sessions should, or shouldn’t be occurring.
At this point in the conversation I also like to point out that we should be stopping and considering at what point all of this API authentication, security, logging, analysis, and reporting and session management becomes surveillance. Are we seeking API security because it is what we need, or just because it is what we do. I know we are defensive about our resources, and we should be going the distance to keep data private and secure, but at some point by collecting more data, and establishing more logging streams, we actually begin to work against ourselves. I’m not saying it isn’t worth it in some cases, I am just saying that we should be questioning our own motivations, and the potential for introducing more abuse, as we police, surveil, and secure our APIs from abuse.
As technologists, we aren’t always the best at stepping back from our work, and making sure we aren’t introducing new problems alongside our solutions. This is why I have my API surveillance research, alongside my API authentication, security, logging, and other management research. We tend to get excited about, and hyper focused on the tech for tech’s sake. The irony of this situation is that we can also introduce exploitation and abuse around our practices for addressing exploitation and abuse around our APIs. Let’s definitely keep having conversations around how we authenticate, secure, and log to make sure things are locked down, but let’s also make sure we are having sensible discussions around how we are surveilling our API consumers, and end users along the way.
API management was the first area of my research I started tracking on in 2010, and has been the seed for the 85+ areas of the API lifecycle I’m tracking on in 2017. It was a necessary vehicle for the API sector to move more mainstream, but in 2017 I’m feeling the concept is just too large, and the business of APIs has evolved enough that we should be focusing in on each aspect of API management on its own, and retire the concept entirely. I feel like at this point it will continue to confuse, and be abused, and that we can get more precise in what we are trying to accomplish, and better serve our customers along the way.
The main concepts of API management at play have historically been about authentication, service composition, logging, analytics, and billing. There are plenty of other elements that have often been lumped in there like portal, documentation, support, and other aspects, but securing, tracking, and generating revenue from a variety of APIs, and consumers has been center stage. I’d say that some of the positive aspects of the maturing and evolution of API manage include more of a focus on authentication, as well as the awareness introduced by logging and analytics. I’d say some areas that worry me is that security discussions often stop with API management, and we don’t seem to be having evolved conversations around service conversation, billing, and monetization of our API resources. You rarely see these things discussed when we talk about GraphQL, gRPC, evented architecture, data streaming, and other hot topics in the API sector.
I feel like the technology of APIs conversations have outpaced the business of APIs conversations as API management matured and moved forward. Advancements in logging, analytics, and reporting have definitely advanced, but understanding the value generated by providing different services to different consumers, seeing the cost associated with operations, and the value generated, then charging or even paying consumers involved in that value generation in real-time, seems to be being lost. We are getting better and the tech of making our digital bits more accessible, and moving them around, but we seemed to be losing the thread about quantifying the value, and associating revenue with it in real-time. I see this aspect of API management still occurring, I’m just not seeing the conversations around it move forward as fast as the other areas of API management.
API monetization and plans are two separate area of my research, and are something I’ll keep talking about. Alongside authentication, logging, analysis, and security. I think the reason we don’t hear more stories about API service composition and monetization is that a) companies see this as their secret sauce, and b) there aren’t service providers delivering in these areas exclusively, adding to the conversation. How to rate limit, craft API plans, set pricing at the service and tier levels are some of the most common questions I get. Partly because there isn’t enough conversation and resources to help people navigate, but also because there is insecurity, and skewed views of intellectual property and secret sauce. People in the API sector suck at sharing anything they view is their secret sauce, and with no service providers dedicated to API monetization, nobody is pumping the story machine (beyond me).
I’m feeling like I might be winding down my focus on API management, and focus in on the specific aspects of API management. I’ve been working on my API management guide over the summer, but I’m thinking I’ll abandon it. I might just focus on the specific aspects of conducting API management. IDK. Maybe I’ll still provide a 100K view for people, while introducing separate, much deeper looks at the elements that make up API management. I still have to worry about onboarding the folks who haven’t been around in the sector for the last ten years, and help them learn everything we all have learned along the way. I’m just feeling like the concept is a little dated, and is something that can start working against us in some of the conversations we are having about our API operations, where some important elements like security, and monetization can fall through the cracks.
I am always surprised at the folks who I meet for the first time who automatically assume I’m all about the REST. It is always something that is more telling about the way they see the world (or don’t), than it ever is about me as THE API Evangelist. It is easy to think I’m going to get all RESTY, and start quoting Roy, but I’m no card carrying RESTafarian, like my buddy Darrel Miller (@darrel_miller) (not that is what Darrel does ;-). Really the only thing I get passionate about is making sure we are reusing the web, and I am pretty much be a sellout on almost everything else.
I am just looking to understand how folks are exposing interfaces for their digital resources using the web, making them available for use in other applications. I feel like RESTful approaches are always a good start for folks to begin considering, and learning from when beginning their journey, but I’m rarely going to get all dogmatic about REST. There are trade-offs with any approach you take to providing programmatic interfaces using the web, and you should understand what these are whether your are using REST, Hypermedia, (g)RPC, GraphQL, or any other number of protocols and technologies available out there. A RESTful approach using the web just tends to be the lowest common denominator, the cheapest, and widest reaching solution we have on the table. Rarely is it ever the perfect solution–there are no such things. #sorry
If you are entering into discussions with me thinking I’m 100% team REST, you are mistaken, and you have profiled yourself considerably for me. It shows me that you haven’t done a lot of (wide) reading on the subject of APIs, and while you may be an expert, you probably are a very siloed expert who doesn’t entertain a lot of outside opinions, and keep an eye on how the space is shifting and changing. When I encounter folks like you in the space you’ll often find me pretty quiet, submissive, and just nodding my head a lot. As you aren’t my target audience, and there isn’t much I can say that will shift your world view. Your opinions are pretty set, and I’m not going to be the one who moves them forward. My role is to reach folks are looking for answers, not those who already have them.
Darrel Miller has a thought provoking post on OpenAPI not being what he thought, shining a light on a very important dimension of what OpenAPI does, and doesn’t do in the API space. In my experience, OpenAPI is rarely what people think, and I want to revisit once slice of Darrel’s story, in regards to folks generally thinking OpenAPI (Swagger) as being all about API documentation. In 2017, the majority of folks I talk to think OpenAPI is about documenting your APIs–something that always makes me sad, but I get it, and is something I regularly work to combat this notion.
First, and foremost, OpenAPI is a bridge to understanding and being able to communicate around using HTTP as a transport, and our greatest hope for helping developers learn their HTTPs and 123s. I meet developers on a regular basis who are building web APIs, yet do not have a firm grasp on what HTTP is. Hell, I’ve had a career dedicated to web APIs for the last seven years, and I’m still developing my grasp on what it is, learning new things from folks like Erik Wilde (@dret), Darrel Miller (@darrel_miller), and Mike Amundsen (@mamund) on a regular basis. In the API game, you should always be learning, and the web is the center of your existence at the moment as a software engineer, and should be the focus of what you are learning about to push forward your knowledge.
Darrel has a great line in his post where he has “a higher chance of convincing developers to stop drinking Mountain Dew than to pry a documentation generator from the hands of a dev with a deadline.” Meaning, most developers don’t have the time or interest to learn about what OpenAPIs, or can do for them in their busy world, they just want the help delivering documentation–a very visual representation of the work they’ve done, and is something they can demonstrate to their boss, partners, and customers. Most developers aren’t spending the time trying to know and understand everything API, thinking deeply on the subject like Darrel and I are doing. Most don’t even have time to read our blog posts. A sad fact of doing business in the tech space, but is something us in charge of API standards and tooling, or even selling API services should be aware of.
You see an essence of this with API code generators, and API testing from OpenAPI. Although in much lesser quantities than API documentation enjoys. Developers just want the assist, they really don’t care whether it is the right way of doing things, or the wrong way, and how it fits into the bigger picture. API developers just want to get their work done, and move on. It is up to us analysts, standards shepherds, and API service providers to help educate, illuminate, and incentivize developers to get over their limiting views on what OpenAPI is and/or develop the next killer tooling that helps make their lives insanely easier like Swagger UI did for API documentation. We need to learn from the impact this tooling has made, and make sure the other lifecycle solutions we are delivering speak in similar tones.
If you are reading this piece, and are still in the camp of folks who still see OpenAPI as Swagger UI, don’t feel bad, it is a common misconception, and one that was exacerbated by the move from Swagger to OpenAPI. My recommendation is that you begin to look at OpenAPI independent of any tooling it enables. Think of it as a checklist for your HTTP learning, sharing, and communication across your API development team. It shouldn’t be just about delivering documentation, code, tests, or anything else. OpenAPI is about making sure you have the HTTP details of your API delivered in a consistent way, across not just a single APIs, but all the APIs you are delivering. OpenAPI is the bridge to where you are now with your API operations, to where you should be when it comes to the definition, design, deployment, management, and delivering sustainable contracts around the digital assets you are serving up internally, with partners, and 3rd party developers. It may see like extra work to think about it this way, but it is something that will save you time and money down the road.
I am picking up some of my past work, so that I can move forward in a new way. A while ago, I began working on my subway map API to help me articulate aspects of the API lifecycle, and provide a “vehicle” for helping folks explore some often complex API concepts, in a way that would incrementally introduce them to new ideas. I used the subway map as an analogy because it has been historically used to help folks understand complex systems, and help them navigate it, even if they don’t fully understand everything about it. I gave a talk at @APIStrat in Austin, TX on this subject, but something I haven’t moved forward in over a year.
For my proof of concept I published 15 “stops” along the request “line” for my API design “area”. I don’t have the visual elements present for this functionality, as I just wanted to prove that I could use Liquid and HTML for a hypermedia client, using Siren YAML published to Github. I was forced to add a layout: property to my Siren schema, which is probably heresy to couple to the client in this way, but it is something I’m willing to take a hit for. Everything else is pure Siren. While there is still a lot more work to be done, I was able to expand the boundaries of how I use hypermedia and Jekyll, in a single proof of concept–telling me the idea is worth moving forward with.
To make things work I published a set of Siren hypermedia YAML documents (I know Kevin, I’m making you cringe, but bear with me) to a Jekyll collection called _design. Then I have three Jekyll client templates in the _layouts folder, called area, lines, and stops. My client isn’t that sophisticated for this proof of concept, but I am able to easily work with the entities, properties, and links in Liquid effectively. I’m just wanting to show that I can take my YAML to the next level, and expand my link relations beyond just next and previous that is often associated with Jekyll _posts, opening up the entire IANA link relations catalog of options, plus anything custom I will need (ie. area, line, stop). It doesn’t look like much, but it provides a pretty compelling example of using Jekyll and Github to deliver complex content that will be changing regularly in a static way.
The viewable side of my hypermedia Jekyll subway map API client doesn’t look that attractive yet, but it provides the basic next/previous functionality, as well links back to the area, or line of coverage. This is just the first two types of experience I’m looking to provide as people explore. I will be introducing transfers, help, and other supporting link relations between the content being made available. Eventually the home page of these projects will be a subway map with accompanying key, and then you choose the area, explore lines, and get off on any stop you desire. It’s just a start, but I feel my Jekyll hypermedia client proof of concept is a success, and I’ll get to work on publishing more content, and adding the visual elements that make it truly a subway map experience.
I am spending two days this week with the Capital One DevExchange team outside of Washington DC, and they’ve provided me with a list of questions for one of our sessions, which they will be recording for internal use. To prepare, I wanted to work through my thoughts, and make sure each of these answers were on the tip of my tongue–here is one of those questions, along with my thoughts.
I’m not a big fan of predictions. It is a game that analysts and investors play to try and shape the world they want to see. I’m usually focused on shaping the world I want to see by understanding where we are, how we got here, and making incremental shifts in our behavior today. I tend to think that technology futurists are more about ignoring the past, and being in denial about today, than they are ever about what is happening in the future. However, with that said, let me share some thoughts about what I think the future will hold when it comes to doing APIs.
Mostly, by 2024 things will look much like they do today. Nothing moves as fast we think they will in the tech sector. I’ve done a lot of studying the history of predictions by technology blogs, and analysts firms over the last decade about what they thought 2017 would be like, and it is 98% bullshit. It was more about getting what they wanted in the year they made the prediction, than it was ever about the future. Technology always feels like it is moving faster than ever before but honestly, what has happened in the last decade that is really seismic? I’d say iPhone is the biggest, with mostly everything else being pretty incremental, and slow moving.
By 2024, we will still be struggling with technical debt. However, the debt limit ceiling will have been raised significantly. We won’t have decoupled the monolith, all while adding to it. The web will still be preferred approach to delivering APIs, although there will be numerous add-ons, and proprietary bastardizations surgically attached to it. REST APIs will be relic of the past, but web APIs will still be the preferred approach because it is simple, however there will be multiple speed API approaches reflecting what we are seeing from gRPC. Think about how different the web APIs are today, from the web APIs in 2010–there really isn’t that many changes. They still suck as bad as they did back then, with a few well designed ones available–just like there was in 2010.
There will be significant movements in the world of API definitions and schema. OpenAPI will have opened up a flood of competing specifications, as well as industry specific implementations using those formats. With the mainstream adoption of web APis, the need for common schema, dictionaries, and design patterns will increase. A significant portion of these API definitions will follow the web, reusing existing patterns, and evolving them in meaningful ways. The rest will be proprietary, complex, and not actually do many API providers much good, yet they will be used by popular APIs, so they will end up being baked into systems and applications. There will never be a single definition to rule them all, although we will see some strong media types emerge that encourage doing APIs in a powerful way at scale.
Much of the API sector will be deployed on the backs of cloud giants. The web landscape will be so volatile and un-secure, small business will not be able to operate without the assistance of platforms with the security expertise to defend our APIs. Cyberwarfare will be normalized, and corporations and state actors will routinely disrupt, infiltrate, and make doing business online near impossible. If you want to stay in business as an API provider you will have to pay a security tax to a cloud enforcer to ensure your services stay up. This will continue to give a unhealthy amount of power to large companies to dictate which types of APIs should be available, which new ideas get created, and shaping the overall notion of what is APIs–kind of the gilded age, after the wild west.
In 2024, algorithmic APIs will outpace strictly data or content APIs, with artificial intelligence, and machine learning dominating the landscape. This shift will mostly cause noise, disturbances, and fuel cyber(in)security, but there will be incremental evolution in these disciplines that actually deliver real business value. There will be enough value created from AI, ML, and algorithmic APIs to keep investment going, but most of it will be hype, over promised, and just confusing to users, continuing to give APIs a bad time. Luckily APIs will also be used to help make these black box algorithms more observable, accountable, and regulated, although may will still remain out of sight due to intellectual property, complexity, and other forms of technological theater.
Another aspect the API space that will continue to move front and center is event-driven architecture. As API move into the mainstream, much of the design thinking that goes into them will be heavily centered around business activities, and the events that matter to humans (and companies). We will see event sourcing, webhooks, and real-time technologies continue to evolve and grow, and become baked into our API approaches, much like REST design patterns are today. While it will continue to grow and evolve, the event landscape will be littered with technologies that confuse, complex, and go against much of what we’ve built over the years with simple API design approaches, but this won’t stop folks from thinking it is a good idea.
APIs in 2024 won’t be the shiny, API saved world we envision. It will still be pretty messy, and many will question whether we should be doing APIs in the first place, but doing business on the web will require moving data, and content around at scale, leveraging open and commercial algorithms to get things done. So APIs will persist, but not necessary thrive. They will be ubiquitous, and rarely exciting. Kind of like a gas station or convenience store. You will use them daily, they will be everywhere, but rarely will they bring inspiration as they have in the early days. One side effect of this will be that all government will be doing APis, but sadly it won’t bring the agility, and nimbleness we’ve all promised. It will usually just make things harder, and open public resources up for exploitation by commercial vendors who are offering the true solution for a fee in the parking lot.
I am spending two days this week with the Capital One DevExchange team outside of Washington DC, and they’ve provided me with a list of questions for one of our sessions, which they will be recording for internal use. To prepare, I wanted to work through my thoughts, and make sure each of these answers were on the tip of my tongue–here is one of those questions, along with my thoughts.
The main API development in 2017 has been the continued shift towards mainstream API adoption. The concept has been moving outside of the tech sector for a couple years now, but in 2017 it is very clear that it’s not just something startups are doing. This is having a profound shift in how we talk about APIs, and how we approach the API lifecycle. APIs have historically been something new and smaller companies are doing, but often will deliver at scale (AWS, SalesForce, eBay, etc.). The mainstream shift in the API sector brings a whole new set of challenges, and opportunities as existing companies, with existing technology in place, work to shift towards an API way of doing things.
This shift impacts the technology of doing APIs, but really isn’t the main event–things will be business as usual when it comes to the technology of APIs for many years to come. I’d say the main event has been in the business of doing APIs. How these APIs get funded will be entirely different from how startup focused APIs get funded. This shift in financial incentives behind why APIs are developed, operated, and ultimately deprecated, will have profound effects on what is an API. They will have less of the startup shine, and become more robust, providing commercial, and industrial grade digital resources that are more mature than the newer, younger APIs we’ve seen in recent years.
Alongside this shift, another development in the business of APIs has occurred. The funding landscape for startups has shifted substantially. In the last couple years the majority of startup funding has dried up, making it a much more competitive environment for the API providers and service providers startups to get the money they need to grow and evolve. This vacuum has allowed for a new, more volatile, and API driven way of funding to emerge, based upon the blockchain, in the form of Initial Coin Offering, or simply ICO. This approach to raising money is quickly becoming a preferred, albeit a more volatile approach to funding startups. It is something that will work against the reliability and stability of depending on APIs, even more so than we saw with startup and investment culture over the last 7 years.
These two shifts in the business of APIs will be at odds with each other. There will still be startups doing APIs, but they will often be more volatile, ephemeral, and unreliable, but they will still continue to define what is next. While the mainstream adoption of APIs will become a more mature, stable, sustainable approach to doing business with APIs. Providing the services, tooling, and resources that are needed to make the economy work in the Internet age. This shift in the business of APIs landscape will have positive and negative effects on the API sector, but ultimately working together to move things forward in a more sensible way than we have in the wild west of the API sector. We will als begin to see more standards emerge, and more industry leaders who dominate their sector because they’ve found a sustainable way of doing APIs, that transcends much of the hype we’ve seen, and will continue to see in the startup community.
I am spending two days this week with the Capital One DevExchange team outside of Washington DC, and they’ve provided me with a list of questions for one of our sessions, which they will be recording for internal use. To prepare, I wanted to work through my thoughts, and make sure each of these answers were on the tip of my tongue–here is one of those questions, along with my thoughts.
There are always endless numbers of fabricated unsolved problems in the API space. These are unsolved problems that are usually unsolved because they were just made up to get someone to buy a new service or product. They aren’t real problems. Technology is good at being applied to make believe problems, vendor fabricated problems, and solving real problems created by the last couple of waves of technology. I’d say that biggest unsolved problem is how APIs can actually solve the legacy technical debt challenges large companies, institutions, and government agencies face. There is a lot of rhetoric regarding how APis can unwind all of this, but honestly there are few examples of it in practice. With many API efforts in 2017, bogged down in cultural friction, and a web (pun intended) of technical complexity.
One aspect of this problem of legacy technical debt is the problem of delivering technology in an Internet age, without actually embracing the web. People doing APIs don’t always have the knowledge of the web, and what makes it work before they get to work doing APIs. Vendors are offering up tools and services that provide web API solutions, but don’t always embrace the web, and the existing standards and protocols that make the web work. APIs are the next evolution of the web, and rely on many of the same concepts for it to work. When you ignore the web when doing APIs, you will always face challenges in interoperability, and reuse, and often end up building siloed solutions that do not achieve deliver the solutions that doing APIs have promised.
I regularly see API providers, and service providers delivering their solutions to problems using APIs, with no awareness of the web they are relying on, and the building blocks that make it work. We often see technological solutions that can potentially deliver value, but are done so in a proprietary way, with all roads leading towards a commercial solution, extracting any value from open source implementations, and the web, while not giving anything back. All of this behavior is just leading to the next generation of technical debt, that is being sold as the solution for the last couple of generations of technical debt. When in reality, everyone wants to just operate efficiently, securely, and effectively using the web, but rarely do they ever fully invest in learning what that means, and working with vendors who support this vision.
It is easy to say the biggest unsolved problems are the new ones, but in my opinion, this is lazy. New technology will always become yesterday’s old technology. in the race to adopt each new wave of technological solution it is easy to pass the biggest existing problems off to other folks, and pay attention to the new and shiney technology. This is easy. Yes it just contributes to the problem, and rarely actually provides comprehensive, equitable solutions that will make a difference unwinding the technical debt we’ve already built up. So I tend not to get distracted with the endless waves of made up problems you read about daily. I think the biggest unsolved problem in the space is technical debt, and how we unwind it as we move forward. This isn’t just about adopting new acronyms, phrases, terms, and technology. It is often about deeply considering the why and how we are doing this, and having open conversations about what technology we should be doing, and shouldn’t be doing. Then doing it in small enough chunks that it can be eliminated as soon as it shifts from asset to a liability, and either eliminated or replaced with something that works.
The number API that gets me out of bed each day, with an opportunity to apply what I’ve learned in the API sector is with the Human Services Data API (HSDA). Which is an open API standard I am the technical lead for which helps municipalities, and human service organizations better share information that helps people find services in their communities. This begins with the basics like food, housing, and healthcare, but in recent months I’m seeing the standard get applied in disaster scenarios like Hurricane Irma to help organize shelter information. This is why I do APIs. The project is always struggling for funding, and is something I do mostly for free, with small paychecks whenever we receive grants, or find projects where we can deliver an actual API on the ground.
Next, I’d say it is government APIs at the municipal, state, but mostly at the federal levels. I was a Presidential Innovation Fellow in the Obama administration, helping federal agencies publish their open data assets, take inventory of their web services. I don’t work for the government anymore, but it doesn’t mean the work hasn’t stopped. I’m regularly working on projects to ensure RFPs, and RFIs have appropriate API language in them, and talking with agencies about their API strategy, helping educate them what is going on in the private sector, and often times even across other government agencies. APIs like the new FOIA API, Recreational Information Database API, Regulations.gov, IRS APis, and others will have the biggest impact on our economy and our lives in my opinion, so I make sure to invest a considerable amount of time here whenever I can.
After that, working with API education and awareness at higher educational institutions is one my passions and interest. My partner in crime Audrey Watters has a site called Hack Education, where she covers how technology is applied in education, so I find my work often overlapping with her efforts. A portion of these conversations involve APIs at the institutional level, and working with campus IT, but mostly it about how the Facebook, Twitter, WordPress, Dropbox, Google, and other public APIs can be used in the classroom. My partner and I are dedicated to understanding the privacy implications of technology, and how APIs can be leveraged to give students and faculty more control over their data and content. We work regularly to tell stories, give talks, and conduct workshops that help folks understand what is possible at the intersection of APIs and education.
After that, I’d say the main stream API sector keeps me interested. I’m not that interested in the whole startup game, but I do find a significant amount of inspiration from studying the API pioneers like SalesForce and Amazon, and social platforms like Twitter and Facebook. As well as the cool kids like Twilio, Stripe, and Slack. I enjoy learning from these API leaders, studying their approaches, but where I find the most value is sharing these stories with folks in SMB, SME, and the enterprise. These are the real-world stories I thrive on, and enjoy retelling as part of my work on API Evangelist. I’m a technologist, so the technology of doing APIs can be compelling, and the business of doing this has some interesting aspects, but it’s mostly the politics of doing APIs that intrigues me. This primarily involves the politics of the humans involved within a company, or industry, providing what I always find to be the biggest challenges of doing APIs.
In all of these areas, what actually gets me up each day, is being able to tell stories. I’ve written about 3,000 blog posts on API Evangelist in seven years. I work to publish 3-5 posts each weekday, with some gaps in there due to life getting in the way. I enjoy writing about what I’m learning each day, showcasing the healthy practices I find in my research, and calling out the unhealthy practices I regularly come across. This is one of the reasons I find it so hard to take a regular job in the space, as most companies are looking to impose restrictions, or editorial control over my storytelling. This is something that would lead to me not really wanting to get up each day, and is the number one reason I don’t work in government, resulting in me pushing to make change from the outside-in. Storytelling is the most important tool in my toolbox, and it should be in every API providers as well.
When it comes to the most influential people and companies in the API space that I am keeping an eye on, it always starts with the API pioneers. This begins with SalesForce, eBay, and Amazon. Then it moves into the social realm with Twitter and Facebook. All of these providers are still moving and shaking the space when it comes to APIs, and operating viable API platforms that dominate in their sector. While I do not always agree with the direction these platforms are taking, they continue to provide a wealth of healthy, and bad practices we should all be considering as part of our own API operations, even if we aren’t doing it at a similar scale.
Secondarily, I always recommend studying the cloud giants. Amazon is definitely the leader in this space, with their pioneering, first mover status, but Google is in a close second, and enjoys some API pioneering credentials with Google Maps, and other services in their stack. Even though Microsoft waiting so long to jump into the game I wouldn’t discount them from being an API mover and shaker with their Azure platform making all the right moves in the last couple of years as they played catch up. These three API providers are dictating much of what we know as being APIs in 2017, and will continue to do so in coming years. They will be leading the conversation, as well as sucking the oxygen out of other conversations they do not think are worthy. If you aren’t paying attention to the cloud APIs, you won’t remain competitive, no matter how well you do APIs.
Next, I always recommend you study the cool kids of APIs. Learning about how Twilio, Stripe, SendGrid, Keen, and the other API-first movers and shakers are doing what they do. These platforms are the gold standard when it comes to how you handle the technical, business, and politics of API operations. You can spend weeks in their platforms learning from how they craft their APIs, and operate their communities. These companies are all offering viable resources using web APIs, that developers need. They are offering these resources up in a way that is useful, inviting, and supportive of their consumers. They are actively investing in their API community, keeping in sync with what they are needing to be successful. It doesn’t matter which industry you are operating in, you should be paying attention to these companies, and learning from them on a regular basis.
The biggest challenge for big companies doing APIs is always about people and culture. Change is hard. Decoupling things at large companies is difficult. While APIs can operate at scale, they excel when they do one thing well, and aren’t burdened with the scope, and complexity of much of the software systems we see already operating within large companies. These large systems take large teams of people to operate, and shifting this culture, and displacing these people isn’t going to happen easily. People are naturally skeptical of new approaches, and get very defensive when it comes to their data, content, and other digital assets, as it can be seen as a threat to their livelihood–opening up and sharing these resources outside their sphere of influence.
The culture that has been established at large companies won’t be easily undone. It is a culture that historically has had a pretty large gulf between business groups, and the IT groups who delivered the last generation of APIs (web services), that weren’t meant to be accessible, and understandable by business users. Web APIs have become simpler, more intuitive, and have the potential to be owned, consumed, and even in some cases deployed by business users. Even with this potential, many of the legacy rifts still exist, and business users feel this isn’t their domain, and IT and developer groups often feel APIs are something that should stay in their domain–perpetuating and confusing existing challenges already in place.
While there may be small API success within large companies, they often experience significant roadblocks when they try to scale, or spread to other groups. A huge investment is needed in API training amongst not just business users, but also developer and IT groups who may not have the experience with the web that is needed to make an API program successful. This can be the most costly and time-consuming aspect of doing APIs, and with many APIs being born out of technical groups, and are often under-funded, experimental efforts, investment in the basics of web literacy, and API training is often anemic. Setting the stage for what is happening when you begin unraveling legacy systems and processes is essential to minimize friction across API implementations. Without it, humans and culture will be your biggest obstacles to API success.
Web literacy, and API training really isn’t much different than other areas where corporate training is being applied, but for some reason many companies just expect the technology folks to know what they know already, or problem solve and learn on the job. This might have been find when things got done in purely technical circles, but web APIs aren’t purely about tech. They are about leveraging the web to solve problems people face within a company, getting access to resources, and working with external partners to help move business forward. IT and developer staff aren’t always ready for this type of external facing roles, and if business users aren’t up to speed on what is needed, API implementations will stumble, sputter, and ultimately fail. Think of the partnerships it’s taken to make the web work at your company, everyone is using the web, why should it be different with APIs? If APIs are done right, and people are properly educated, there is no reason an entire group can’t work in concert.
Every API effort I’ve seen fail had one common road block–people. There were IT groups that sabotaged, sales teams that felt threatened, executive leadership who didn’t understand what was happening, or partners who were in proper alignment with API efforts. Sure, sometimes the challenges are purely technical. Lack of proper API design. Insufficient security or capacity. These are simply API training and education issues as well. You can’t throw the need for integration of resources between internal groups, external partners, or 3rd party developers using the web at any technical group and expect them to understand what is needed. Similarly, you can’t mandate APIs across business groups, and just expect them to get on-board without any friction. Invest in the web literacy skills, API training and awareness, and communication skills that will be required to do APIs right, and the chances your API efforts will succeed will greatly increase.
The biggest change in the industry since I started doing API Evangelist in 2010 is who is doing APIs. In 2010 it was 95% startups doing APIs, with a handful of enterprise, and small businesses doing them. I’d say over the last couple years the biggest change is that this had spread beyond the startup community and is something we see across companies, organizations, institutions, and government agencies of all shapes and sizes. Granted, there are a variety when it comes to the level they are doing them, and the quality, but APIs are something that has been moving mainstream over the last seven years, and becoming more commonplace in many different industries.
In 2010 it was all about Twitter, Facebook, Amazon, and many of the API pioneers. This has been rapidly shifting to each wave of startups like Twilio, Stripe, Slack, and others. However, now in 2017 I am seeing insurance companies, airlines, car companies, universities, cities, and federal agencies with API programs. I mean, c’mon, Capital One has an API program (wink, wink). While I still hold influence with each wave of API service providers looking to sell to the space, and many of the API startup providers, my main audience is folks on the frontline of the enterprise, and government agencies at all levels. I also have a growing number of people at higher educational institutions tuning into what I’m writing as they look to evolve their approach to technology. APIs were mainly a startup thing in 2010, and in 2017 it is about getting business done in a digital age thing.
The technology of APIs is still expanding and we are seeing things push beyond just REST, and web APIs, but by far the biggest change has been more about the business of doing APIs, and more importantly sometimes, the politics of doing APIs. These are areas of the industry that are rapidly expanding and evolving as new people onboard with the concept of an API, and the opportunity for doing APIs. As we add new companies, organizations, institutions, agencies, and industries to API conversation, the technology of APIs hasn’t shifted to much, but the business and political landscape is flexing, shifting, and evolving at a pretty rapid pace, and it is something that isn’t always a good thing. Along with it comes privacy, security, financial, and other challenges that will only get worse if there isn’t more discussion and collective investment.
The shift I’ve seen between 2010 and 2017, feels a lot like the change I witnessed from 1995 to 2002 with the web, but this time it’s more than just about websites, it is also about mobile applications, devices, conversational interfaces, automation, and much more. Honestly, it is simply just the next evolution of the web, where there is significantly more channels for operating on than just a browser, and there is a growing amount of digital assets being distributed via the web beyond just text, and images. Video has picked up speed, voice and audio are finally maturing, and algorithms, machine learning, and artificial intelligence are seeing a significant uptick. While all of these areas will have their impact, the biggest changes will come from leading industries like healthcare, education, banking, transportation, and others going beyond just dipping their toes in the API space, but baking it into everything they do.
|<< Prev||Next >>|