“Is this t-shirt real?” On e-commerce and photo-editing
What’s the latest item you bought online? A lounge chair? A t-shirt? A cell phone? You might have picked the top option served up by the marketplace, or wandered through the cornucopia of virtual storefronts. If so, chances are good that the visual was among the first criteria for your choice.
What if I told you there’s a good chance this visual wasn’t real?
Or, to be more accurate, that it sat somewhere on the spectrum from “heavily retouched” to “completely 3D”. Let’s explore the world of e-commerce, where textures and hues are not what they seem, and look at the tools propelling this shift.
In e-commerce, the visual is key
In a brick-and-mortar shop, all your senses are engaged, and shop designers take great care in choosing even the music and the smell. In e-commerce, all you have is vision (alright, and some customer reviews, but that’s for another article). The image is the product. Eye-tracking studies of e-commerce sites measure attention in milliseconds: that’s how long a potential buyer will spend on a thumbnail before deciding whether to click.
Research firm eMarketer estimates that worldwide retail e-commerce sales increased 27.6% in 2020, for a total of $4.280 trillion. COVID-19 drove much of this growth, but e-commerce will remain a fixture for brands, especially in growing markets such as China and Latin America.
Understandably, this has opened up a large B2B market for making the lives of online sellers, both corporate and indie, easier. The days of haphazard photo angles and home backgrounds are over.
Just as the shopping funnel has been standardised by tools such as Shopify, so the visual grammar of items has been simplified: the object alone, on a white background. This is by far the most common representation of products in e-commerce. You rarely see objects shown in use in their main image, as was sometimes the case in mail-order catalogs.
Among a dozen very specific image requirements, Amazon demands a pure white background for all main product images. eBay, for its part, still simply asks that you take the photographs yourself. Could this contrast in visual approach be part of the explanation for the diverging destinies of the two marketplaces?
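To make that rule concrete, here is a minimal sketch, in Python with Pillow, of the kind of automated check a seller could run before uploading. The border-sampling heuristic, tolerance, and file name are my own illustrative assumptions, not Amazon’s actual validation logic.

```python
from PIL import Image

def has_white_background(path: str, tolerance: int = 5) -> bool:
    """Heuristic: sample the image border and require near-white pixels."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    # Sample every 10th pixel along all four edges of the image.
    border = (
        [img.getpixel((x, 0)) for x in range(0, w, 10)]
        + [img.getpixel((x, h - 1)) for x in range(0, w, 10)]
        + [img.getpixel((0, y)) for y in range(0, h, 10)]
        + [img.getpixel((w - 1, y)) for y in range(0, h, 10)]
    )
    # Every sampled channel must be within `tolerance` of pure white (255).
    return all(all(255 - channel <= tolerance for channel in px) for px in border)

print(has_white_background("tshirt.jpg"))  # placeholder file name
```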
White-on-white
The presentation rules enforced by platforms mean that sellers have to meet consistent standards, across product-and-picture combinations that can quickly run into the hundreds, if not thousands. Sellers are therefore looking to streamline the process of taking and editing pictures, and to reduce costs.
Larger companies can do this in-house, but it quickly becomes tricky for smaller sellers. An experienced freelance product photographer might charge $50/hour, with or without basic editing. Alternatively, a seller might take the photographs themselves and hope that services such as Fixthephoto deliver on their promise: $2 per photo for background removal on simple objects, with a 24-hour turnaround. This seems like a bargain, unless you have many pictures to edit, or want more complex modifications. A seller might want to quickly change the color of a couch, increase the size of a shoe, or even iron a shirt. This is much more time-consuming for the Photoshop artist, and costlier.
This is where automation steps in, with companies developing software that leverages machine learning to retouch and improve photos, with the further goal of integrating directly with e-commerce platforms.
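As a rough illustration of what such a pipeline can look like, here is a minimal sketch using the open-source rembg library for background removal and Pillow for compositing onto the marketplace-mandated white background. The file names are placeholders, and production systems are of course far more involved.

```python
# Sketch: smartphone shot in, white-background listing image out.
from rembg import remove   # pip install rembg
from PIL import Image

def to_white_background(in_path: str, out_path: str) -> None:
    product = Image.open(in_path)
    cutout = remove(product)          # RGBA cutout with a transparent background
    canvas = Image.new("RGBA", cutout.size, (255, 255, 255, 255))
    canvas.alpha_composite(cutout)    # paste the cutout over pure white
    canvas.convert("RGB").save(out_path, "JPEG")

to_white_background("smartphone_shot.jpg", "listing_ready.jpg")
```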
This t-shirt doesn’t exist
As we go further into the digital realm, and further away from the “original” object, we enter the world of 3D modeling. 3D modeling has a lot going for it: flexibility, quick turnaround for edits, streamlined costs… Furniture behemoth Ikea started including computer-generated photographs in its catalog in 2014. Nobody noticed the difference, and the 3D renders actually scored better than “real” photographs in internal quality assessments. Ikea kept developing 3D models, and an estimated 70% of its catalog is now made of “virtual” renders.
For a company such as Ikea, going digital is a no-brainer. To create its catalog, the firm used to ship all the furniture for each room to a huge shooting depot, assemble it (and we all know there is always a missing nail, right there at the end), photograph the set, and then… destroy it to shoot the next room. Moreover, as Ikea operates globally, it needs to adapt to each culture: an American kitchen is not the same as a Swedish or a Chinese one. You can imagine the cost, and the waste, such a protocol creates.
Ikea does all this digital shooting in-house, with a dedicated team and technology. Could smaller online sellers afford to do the same? Once a model is created, it can be “photographed” in virtually any setting, color, and texture with ease. But the catch lies in “once the model is created”: this is still an expensive process, as demonstrated by the price tag of 3D modeling companies such as zerolens, whose pricing starts at €150/month for 10 pictures, with the user actually providing the 3D models.
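For the technically curious, here is a minimal sketch of what “photographing” a finished model looks like in code, using the open-source trimesh and pyrender libraries. The model file, camera distance, and lighting are placeholder assumptions (and offscreen rendering needs an OpenGL context, e.g. PYOPENGL_PLATFORM=egl on a headless server); the point is simply that once the model exists, new angles are a for-loop, not a photo shoot.

```python
import numpy as np
import trimesh
import pyrender
from PIL import Image

# Load the model (placeholder file), then center and normalize it so the
# hard-coded camera distance below frames it reasonably.
tm = trimesh.load("sofa.glb", force="mesh")
tm.apply_translation(-tm.centroid)
tm.apply_scale(1.0 / tm.scale)

scene = pyrender.Scene(bg_color=[1.0, 1.0, 1.0, 1.0])   # the usual white
mesh_node = scene.add(pyrender.Mesh.from_trimesh(tm))

# Camera pulled back along +z, looking toward the origin (pyrender's default).
cam_pose = np.eye(4)
cam_pose[2, 3] = 2.0
scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=cam_pose)
scene.add(pyrender.DirectionalLight(intensity=3.0), pose=cam_pose)

renderer = pyrender.OffscreenRenderer(viewport_width=800, viewport_height=600)

# One model, eight "photographs": rotate the object and re-render.
for i, angle in enumerate(np.linspace(0.0, 2 * np.pi, 8, endpoint=False)):
    scene.set_pose(mesh_node, trimesh.transformations.rotation_matrix(angle, [0, 1, 0]))
    color, _ = renderer.render(scene)
    Image.fromarray(color).save(f"sofa_view_{i}.png")
renderer.delete()
```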
We therefore have, on one side, “traditional” photography, powered by platforms such as Meero or Ocus, which produces high-quality visuals but doesn’t scale well; and on the other, technological solutions that require skill and/or a lot of cash.
What’s missing is a middle ground: an automated retouching service that can work with pictures taken on a smartphone by non-specialists. This is what we hope to achieve with ClipDrop: a tool that supports sellers, with a controlled cost and time investment.
Going further
A technology I’m particularly looking forward to seeing applied to e-commerce is neural rendering. A. Tewari, O. Fried, J. Thies et al. define it as “deep image or video generation approaches that enable explicit or implicit control of scene properties such as illumination, camera parameters, pose, geometry, appearance, and semantic structure.” For the moment, these are research-stage technologies, allowing users to edit skin to remove tattoos, animate a face, swap backgrounds, or relight a portrait to match a new background.
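As a taste of the simplest of these operations, the background swap, here is a minimal sketch that cuts a subject out (rembg again, standing in here for a research-grade matting model) and composites it over a new scene. File names are placeholders, and a true neural-rendering approach would also re-estimate lighting and shadows, which this sketch does not.

```python
from rembg import remove   # pip install rembg
from PIL import Image

subject = remove(Image.open("portrait.jpg"))                # RGBA cutout
background = Image.open("studio_scene.jpg").convert("RGBA")
background = background.resize(subject.size)                # naive fit

# Paste the subject over the new background; without relighting, the
# result can look pasted-on, which is exactly what neural rendering fixes.
composite = Image.alpha_composite(background, subject)
composite.convert("RGB").save("portrait_new_scene.jpg")
```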
I could give many more examples, as these technologies are extremely versatile, and all could have potential applications in e-commerce, especially when it comes to backgrounds. But there is a wide gap between a technology designed in a research lab and a product accessible to millions of non-technical users.
To see e-commerce sellers embrace these tools, strong business cases need to be developed, based on solid, specific datasets (and those are still missing). At ClipDrop, we are working with e-commerce platforms and sellers to put our technologies at their service and achieve exactly that: scalable neural rendering for e-commerce.
Deep editing, not shallow ethics
Did you know the Ikea catalog images were 3D renders, not actual photos? I didn’t. Does it matter? As long as the objective isn’t fraud, that is, deliberately transmitting a distorted image of the product, I think not. Ikea even includes imperfections in its renders so that they look… more real. And even unretouched, a carefully lit and angled photograph can greatly “improve” the appearance of a product.
Of course, matters are not always so clear-cut, and the question of ethics in synthetic media is not to be dismissed. I think some of the issues raised by deepfakes, such as authenticity and copyright, could trickle down to e-commerce. If a retailer uses the “digital” double of a celebrity to model its clothes, should it pay them? If several companies use the same generative models and get similar backgrounds as a result, could they sue each other? And, similar to the (rather ineffective) French law requiring retouched photographs of fashion models to be labeled as such, could some platforms or customers start demanding unretouched photographs?
Sooner or later, these questions will emerge, and I think that’s a good thing. It will mean neural rendering is out of the lab, and into the world.
Looking forward to it.