The cold start, part 2

OLAFUR ELIASSON – H2R car, 2006: a biomorphic sculpture that mixes metallic structure and frozen bodywork

In the first half of this post, I looked at why producer and consumer engagement in data sharing is the single most important factor in jump-starting the circular economy.

In this second part, I want to look at the shared data that can be accessed in the short term to move us toward that all-important engagement.

What data is important, NOW?

Not all circular economy business models require the same degree of data capture. This means that some loops can progress more easily.

Katherine Whalen, of the RIISE institute in Gothenburg, makes an important distinction between circular business models that extend resource value and those that extend product value.

Where the products have a relatively low level of material complexity (packaging, textiles or plastics, for example), circular business models are about extending resource value. These are issues of recycling and recovery. For producers, value is captured through reducing material costs compared to virgin materials through maximising the efficiency of collection and reprocessing. The model provided by Worn Again is a great example.

Consumer engagement is critical but it doesn’t require the same level of data sharing that circular business models based upon extending product value will need.

The circular business models for more complex products (household electricals, electronics, industrial machinery, cars, etc.) are based upon a new form of customer relationship. If we are to ‘consume’ our washing machines on a ‘per wash’ basis, or if we want to use products that are repaired and upgraded rather than made obsolete, then our relationship with the product and its producer needs to be much more interactive than we’re used to. These are what Katherine describes as business models that add value through a high degree of ‘firm-product interaction’.

This is a particular problem for the circular economy: making material streams visible means having visibility of their composition, condition and location. Most importantly, this includes the latent materials that are still in use.

Achieving circularity at scale will require 7 data sets gathered from primary sources (directly from value chain actors about specific materials and processes). These 7 primary sources are all identified as nodes in the Circular Challenges Systems Map. Each has its own set of influences and dependencies, and some play a more pivotal role in kick-starting progress than others.

Once we have these data streams, making them accessible to appropriate stakeholders is easy (with all due respect to my developer friends). Getting people to willingly populate the information required for a product passport isn’t.

The circular economy’s cold start problem

The ‘cold start problem’ is one that faces many early stage AI and machine learning products.

The promise of artificial intelligence is that it gives us game changing powers of prediction, analysis and decision making, through the ability to wrangle ‘big data’. The more data that is collected, the more useful, and the more valuable an AI service becomes – this is known as the “data network effect”.

When huge amounts of data are collected from primary sources in the real world, those that contribute to that data pool benefit from the insights that clever AI systems drag out of it. Once started, the data network effect should gather momentum and become a self-perpetuating source of value for the AI company and its users alike. Many refer to it as the ‘flywheel’.

But how do you start this avalanche of benefits when there is little data in the system? Why should those that have the data be first to contribute their precious resource to the pool, when they get so little back from doing so?

Most successful AI businesses have got past this problem by using a mixture of 4 strategies:

Using synthetic data
Public data
Strategic partnerships
Indirect crowdsourcing

Synthetic data

Many software start-ups overlook the value of synthetic data. The fact that it’s more often referred to by the derogatory term ‘dummy data’ shows the attitude that is often taken to it.

When marketing their product, it’s quite common for software companies to provide a free trial version or a ‘sandbox’ version in which a prospect can ‘try’ the product. These are often disappointing. Indeed, in my experience, they’re usually counter-productive and I have often advised clients to drop them. One of the reasons that they fail is that they usually require the prospective user to gather their own data, forgetting that this can be quite a task in itself.

However, when these demonstration versions are pre-populated by credible synthetic data, then the user can begin to see the potential benefits that might justify the task of gathering their real data. An engaged potential user may then be motivated to explore scenarios by substituting ‘dummy’ data with what they think would be applicable to their own business.

As seen from previous posts, linear lock-in (“difficulties identifying viable business models in yet linear systems”) is a real problem. Many producers simply don’t know where to start. Any technology provider that claims its solution can help users realise the value of their waste really should invest time in creating a large volume of synthetic data. Done well, the product will already provide valuable (as in chargeable) insight to a customer even before they have added any of their real data.

Also, seeing the way in which this synthetic data is used and presented might do much to alleviate concerns over how any data that the user enters is going to be accessed and used once it is shared.
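
To make the idea of a pre-populated demo concrete, here is a minimal sketch in Python of how a provider might generate a synthetic asset register. Everything in it is an assumption made for illustration: the asset types, lifetimes, condition labels, site names and the `generate_assets` and `nearing_end_of_life` helpers are all invented, and a credible demo would need far more realistic distributions.

```python
import random
from dataclasses import dataclass

# Hypothetical asset types with (min, max) expected life in years.
# These figures are invented for illustration only.
ASSET_TYPES = {
    "washing machine": (8, 12),
    "laptop": (3, 6),
    "industrial pump": (10, 20),
}
CONDITIONS = ["good", "worn", "needs repair"]
LOCATIONS = ["Site A", "Site B", "Site C"]


@dataclass
class SyntheticAsset:
    asset_id: str
    asset_type: str
    age_years: int
    expected_life_years: int
    condition: str
    location: str


def generate_assets(count: int, seed: int = 42) -> list[SyntheticAsset]:
    """Build a reproducible list of made-up assets for a demo environment."""
    rng = random.Random(seed)  # seeded so every prospect sees the same demo
    assets = []
    for i in range(count):
        asset_type = rng.choice(list(ASSET_TYPES))
        low, high = ASSET_TYPES[asset_type]
        expected_life = rng.randint(low, high)
        assets.append(
            SyntheticAsset(
                asset_id=f"DEMO-{i:04d}",
                asset_type=asset_type,
                age_years=rng.randint(0, expected_life),
                expected_life_years=expected_life,
                condition=rng.choice(CONDITIONS),
                location=rng.choice(LOCATIONS),
            )
        )
    return assets


def nearing_end_of_life(assets, within_years=2):
    """Return assets whose remaining expected life is within the window."""
    return [
        a for a in assets
        if a.expected_life_years - a.age_years <= within_years
    ]


demo_assets = generate_assets(200)
due_soon = nearing_end_of_life(demo_assets)
print(f"{len(due_soon)} of {len(demo_assets)} demo assets "
      "are within 2 years of end of life")
```

The `DEMO-` prefix on every identifier is a deliberate choice: it keeps synthetic records clearly distinguishable once a user starts substituting their own figures.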

Public data

Like ‘dummy’, ‘scraping’ is not a word that people always feel comfortable with, but scraping social media, forums, open data networks, and public databases to gather data can be a powerful tool. Some leading technology providers in the ESG reporting field have already shown that using AI to gather and organise this data can provide valuable insights. Datamaran, for example, “processes thousands of corporate reports (annual reports, sustainability reports and SEC filings), regulations and policies, news and social media. Using NLP (Natural Language Processing), it then analyzes a dynamic ontology of ESG issues across these sources to identify key trends.”

Tools like this have the potential to make the data that is publicly (often freely) available accessible and applicable. For example, when developing solutions and business models to tackle the issue of materials lost through e-waste, a great place to begin is the ProSUM project’s Urban Mine Platform – already a valuable resource for building processor engagement. All it lacks are tools that can process this raw data in a way that makes it more insightful to processors looking to build a business case around material recovery.

Limited though this data might be, by using it creatively we can progress and get a far better insight into what additional data is really needed. In my experience, some companies with data gathering solutions around product sustainability would do well to learn from Jimmy McGill of ‘Better Call Saul’:

“Perfection is the enemy of perfectly adequate”

Strategic partnerships

Another thing that I have seen with new AI companies is that they can be very wary of forming partnerships with others. Because they are usually early stage entrepreneurs, they fear working with more established enterprises may risk their competitiveness, independence or IP. This is a shame because many opportunities can be missed that way.

Some AI companies, particularly in the Pharma sector, have grown by forming partnerships with existing companies that have already generated a great deal of proprietary data, but either have a different use for it or lack the incentive to generate insights from it. Under these circumstances, partnerships can be formed in which the AI company licenses the data or uses it to generate some other value for the data owning company and its customers.

We should ask ourselves: which companies are already gathering data about the condition, location and availability of products in their use phase? Many companies, both solution providers and accountancy firms, maintain asset registries for their clients, but very few see this data as having value in a circular economy sense. The opportunities for that data to be compiled and processed are there to be taken, either as an intrapreneurial project by those companies or as a partnership opportunity for a third party.

Does this information need to be made available at a level of detail that betrays confidentiality? No.

Does this information contain a complete bill of materials (BoM) for those assets? No.

BoM data is ultimately important but, at this stage, the type of asset alone is a great start. If we had the manufacturer and model details, then we would have the basis for showing those manufacturers how and why BoM data would be used by other stakeholders and how sharing it in a controlled way will ultimately benefit them too. The asset may have a useful life of >5 years; if BoM data could be added to the asset’s ‘product passport’ at some point during that period, then the processor that deals with the next stage of its lifecycle will have what they need, when they need it.
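
One way to picture this staged approach is a passport record that is useful with the asset type alone and can be enriched later. The Python sketch below is purely illustrative: the `ProductPassport` class, its field names and the three completeness labels are my own assumptions, not an existing product passport standard, and the manufacturer, model and material figures are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProductPassport:
    # Minimum useful record: simply what kind of asset this is.
    asset_type: str
    # Manufacturer and model can be filled in when known.
    manufacturer: Optional[str] = None
    model: Optional[str] = None
    # Bill of materials: material name -> mass in kg. Empty until supplied.
    bill_of_materials: dict[str, float] = field(default_factory=dict)

    def add_bom(self, materials: dict[str, float]) -> None:
        """Attach or extend BoM data at any point during the use phase."""
        self.bill_of_materials.update(materials)

    def completeness(self) -> str:
        """Describe how much of the passport has been filled in so far."""
        if self.bill_of_materials:
            return "bom"
        if self.manufacturer and self.model:
            return "make-and-model"
        return "type-only"


passport = ProductPassport(asset_type="washing machine")
print(passport.completeness())  # type-only

passport.manufacturer = "ExampleCo"  # hypothetical manufacturer
passport.model = "WM-100"            # hypothetical model
print(passport.completeness())  # make-and-model

passport.add_bom({"steel": 30.0, "copper": 1.2})  # invented figures
print(passport.completeness())  # bom
```

The point of the design is that nothing blocks on the BoM: the record exists and is queryable from day one, and the detail arrives whenever the manufacturer is ready to share it.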

Crowdsourcing

In the end, there can be no substitute for specific data sourced directly from actors in the value chain. There are reasons that actors are wary of sharing data but the lack of a compelling incentive is the primary one. The visionary cause of furthering the circular economy isn’t enough in itself, nor is the promise of medium term efficiency benefits. Sharing data must have an immediate return for those willing to give it.

Which makes me wonder whether we are being creative enough in the ways that we are asking ‘the crowd’ to share data. It need not be ostensibly and obviously linked to the circular economy. Again, this is an example that others may find uncomfortable but we need only look to the ways in which social media platforms like Facebook glean valuable data from users by asking the right kind of questions.

Think about that in a wider sense. Even in a strictly B2B value chain, are we asking the right questions?

For nearly 30 years, the way that we have addressed issues of supplier engagement on compliance and sustainability has been to ask endless questions, and this effort has just created a newer version of business as usual. The circular economy demands that the questions we ask of the value chain in future clearly reflect the mutual dependency of each actor; they’re not just a stick to beat the supply chain with.


Adrian Segens

squaring the circle
