Humanoid Robots in the Workforce

PLUS: DragGAN image editing, Mind-reading AI, and Photoshop's new superpowers

Welcome to the 10th issue of the AstroFeather AI newsletter!

This was another exciting week for AI. Robotics startups unveiled AI-driven humanoid robots ready to be integrated into the workforce, a research lab introduced an impressive “click-and-drag” image-editing technique, Photoshop gained new AI features, and OpenAI has clearly been thinking a lot about artificial superintelligence. You’ll find these trending stories and more covered in this issue!

In today’s recap (10 min read time):

  • Autonomous Humanoid Robots are Entering the Workforce.

  • (Research) Click-and-drag image editing and Generating video from brain activity.

  • Product Previews and Launches.

  • Company Announcements and News Throughout the Industry.

Must-Read News Articles and Updates

Update #1. Autonomous Humanoid Robots are Entering the Workforce.

EVE Robots. Image: 1X (formerly Halodi)

The latest: Robotics startup 1X (formerly known as Halodi) says it is the first to successfully deploy humanoid robots in a professional (real-world) setting. According to the company, its humanoid robots, named EVE, have been patrolling as security guards at select locations in Europe (Norway) and the U.S. (Dallas) since April of this year, with plans to deploy them in hospices and assisted-living facilities.

How EVE works: EVE is a self-balancing, fully mobile, human-size robot with human-like strength and several characteristics that make it ideal for use in unstructured human environments:

  • Mobility and dexterity: EVE has 23 degrees of freedom (the number of independent movements or axes along which a robot can move) and full control of its hands, arms, and joints, allowing it to easily squat down to pick things up off the floor or extend its arms to reach items on warehouse shelves.

    One limitation, however, is that EVE is mounted on a wheelbase and cannot climb stairs. As a result, current EVE models can only be used in facilities with wheelchair-accessible floors, ramps, and easy access to elevators.

  • Human-like strength: The EVE robot can handle objects that weigh ~18 pounds (8 kg) in each arm, which is ideal for retrieving, moving, and organizing a range of small- to medium-sized objects, such as dishes in a cupboard and boxes of items on shelves.

  • Interactivity and connectivity: EVE is equipped with an LED "face" that allows it to show reactions to people. It also has 3D vision and audio for environmental mapping, allowing it to interact with objects, people, and pets, as well as navigate areas that include hallways and corridors.

    Although EVE can navigate autonomously, it can also be remotely controlled by an operator who can "talk through" EVE using the robot's built-in speaker.

Driving the news: While most people are familiar with Atlas from Boston Dynamics or Optimus from Tesla, it is worth noting that several humanoid robotics startups have either recently emerged from stealth mode (a term used to describe a startup's temporary state of secrecy and avoidance of public attention) or closed multimillion-dollar funding rounds, indicating a growing interest in deploying humanoid robots in real-world settings.

Phoenix Robot Organizing Clothing. Image: Sanctuary AI

Phoenix from Sanctuary AI: Vancouver-based Sanctuary AI recently unveiled its Phoenix humanoid robot. The 5 ft 7 in (~170 cm) bipedal robot weighs 155 lbs (~70 kg), can lift objects up to 55 pounds (~25 kg), and travels at 3 miles per hour (~5 kilometers per hour).

In March, Phoenix was deployed in a retail store where it successfully completed several retail-related tasks. To date, Sanctuary AI has raised more than $100 million in funding from various sources, including a Series A round and the Canadian government.

Figure 01 from Figure: Sunnyvale-based Figure emerged from stealth earlier this year and recently raised $70 million in a Series A funding round to continue developing general-purpose autonomous humanoid robots, with the goal of deploying its Figure 01 model in various environments ranging from warehouses to retail.

Figure 01 stands 5 ft 6 in (~168 cm), is bipedal, weighs 132 lbs (60 kg), can lift objects up to 44 lbs (20 kg), and travels at 2.7 miles per hour (4.3 kilometers per hour).

Astra from Apptronik: Although Austin-based Apptronik is working on several autonomous robot prototypes, it appears that the company will announce Astra as its first robot to market. Astra is unique in that it's only an upper body (and has no legs or wheelbase) but can reportedly be mounted on a mobile base unit.

According to the company, Astra is designed for manipulation rather than dynamic locomotion and movement between locations. A compilation video shows Astra's ability to grip objects of various sizes and organize them into a standard packing box.

Why it matters: Labor shortages have been exacerbated by the COVID-19 pandemic, industries such as warehousing and manufacturing are experiencing high employee turnover, and a growing elderly population is increasing the need for caregivers.

Fully functional autonomous humanoid robots could be a viable answer to these challenges, quickly filling labor gaps when and where needed.

Update #2 (Research): Click-and-drag image editing and Generating video from brain activity.

Click-and-drag Image Editing with DragGAN. Image: X. Pan

DragGAN lets you reshape images by clicking and dragging: A group of researchers from the Max Planck Institute for Informatics, MIT, the University of Pennsylvania, and Google have developed a new AI tool called DragGAN that allows users to interactively morph photos, images, and other works of art simply by using a mouse cursor to select interactive points on an image and drag them from one location to another.

How it works: When I first saw DragGAN in action, I thought it was simply warping images by smudging existing pixels, like Photoshop's Warp tool. Interestingly, however, the tool uses a GAN (generative adversarial network) to regenerate the underlying object based on the new locations of those interactive points I mentioned earlier.

DragGAN works through 1) feature-based motion supervision, which moves any point in the image toward a user-specified position, and 2) a new point-tracking approach that keeps track of where those interactive points land after each update. As a result, users can "deform an image and have precise control over where every pixel ends up."

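For the technically curious, here is a toy sketch of that two-step loop. It is not the authors' implementation: a small random linear layer stands in for the real StyleGAN generator, and every size, point, and learning rate below is made up for illustration.

```python
# Toy sketch of DragGAN's edit loop -- NOT the authors' implementation.
# A random linear layer stands in for the GAN generator; we optimize the
# latent so features at the handle point migrate toward the target point.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
H = W = 32                                          # toy feature-map size
latent = torch.randn(1, 64, requires_grad=True)     # learnable latent code
proj = torch.nn.Linear(64, 8 * H * W)               # stand-in "generator"

def features(z):
    return proj(z).view(1, 8, H, W)                 # (1, C, H, W) features

handle = torch.tensor([8.0, 8.0])                   # point the user grabbed
target = torch.tensor([24.0, 24.0])                 # where it was dragged
opt = torch.optim.Adam([latent], lr=0.1)

with torch.no_grad():                               # feature we will track
    ref = features(latent)[0, :, int(handle[0]), int(handle[1])].clone()

for step in range(200):
    y, x = int(handle[0]), int(handle[1])
    # 1) Motion supervision: push the features one small step ahead of the
    #    handle (toward the target) to match the handle's current features.
    d = F.normalize(target - handle, dim=0)
    y2 = max(0, min(H - 1, int(round((handle[0] + d[0]).item()))))
    x2 = max(0, min(W - 1, int(round((handle[1] + d[1]).item()))))
    feat = features(latent)
    loss = F.l1_loss(feat[0, :, y2, x2], feat[0, :, y, x].detach())
    opt.zero_grad(); loss.backward(); opt.step()
    # 2) Point tracking: re-locate the handle as the nearby pixel whose
    #    features best match the tracked reference feature.
    with torch.no_grad():
        dist = (features(latent)[0] - ref[:, None, None]).abs().sum(0)
        ys = slice(max(0, y - 2), min(H, y + 3))
        xs = slice(max(0, x - 2), min(W, x + 3))
        i = int(dist[ys, xs].argmin())
        w = dist[ys, xs].shape[1]
        handle = torch.tensor([ys.start + i // w, xs.start + i % w],
                              dtype=torch.float)
    if (handle - target).norm() < 1.0:              # close enough: done
        break

print("handle ended at", handle.tolist())
```

In the real system, the generator is a StyleGAN model and only its latent code is optimized, which is why DragGAN regenerates the underlying object instead of smudging pixels.
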
Observations and results: Videos on the research team's site show the tool's ability to change the expression, orientation, pose, and dimensions of any subject in a photo (including cars, animals, people, and landscapes). One of the demos even shows a dog being made chubbier, its mouth opened wider, and its demeanor changed by simply dragging those magical interactive points around!

Why it matters: DragGAN represents a step change in image and photo editing. The system's intuitive interface and "click-and-drag" interaction make complex image editing accessible to a much wider group of users than current industry-standard applications.

MinD-Video reconstructs videos from brain scans. Image: J. Qing

MinD-Video converts human thoughts into videos: A team of scientists from the National University of Singapore and the Chinese University of Hong Kong have used an fMRI decoder and the Stable Diffusion AI image generator to create an AI model called MinD-Video that generates high-quality video from human (fMRI) brain scans.

How it’s built: The MinD-Video system, defined by the research team as a "two-module pipeline designed to bridge the gap between image and video brain decoding," consists of a trained fMRI decoder and a fine-tuned version of Stable Diffusion.

To train MinD-Video, the research team used a publicly available dataset containing videos paired with the fMRI brain scans of subjects recorded while they watched those videos.

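As a hedged sketch of that two-module shape (scan → embedding → conditioned video), the toy classes below only mirror the interfaces the paper describes; the real system uses a trained fMRI encoder and a fine-tuned Stable Diffusion video generator, not these stand-ins, and all dimensions here are invented.

```python
# Toy sketch of MinD-Video's two-module pipeline -- stand-ins only.
import torch
import torch.nn as nn

class FMRIDecoder(nn.Module):
    """Maps a flattened fMRI scan to a conditioning embedding."""
    def __init__(self, n_voxels=4000, embed_dim=768):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_voxels, 1024), nn.GELU(),
                                 nn.Linear(1024, embed_dim))
    def forward(self, scan):
        return self.net(scan)

class ToyVideoGenerator(nn.Module):
    """Stand-in for the fine-tuned Stable Diffusion module: it shows only
    the interface (embedding in, frames out), not actual diffusion."""
    def __init__(self, embed_dim=768, frames=8, size=64):
        super().__init__()
        self.frames, self.size = frames, size
        self.to_pixels = nn.Linear(embed_dim, frames * 3 * size * size)
    def forward(self, cond):
        return self.to_pixels(cond).view(-1, self.frames, 3,
                                         self.size, self.size)

decoder, generator = FMRIDecoder(), ToyVideoGenerator()
scan = torch.randn(1, 4000)          # one fake fMRI sample (4,000 voxels)
video = generator(decoder(scan))     # -> (1, 8, 3, 64, 64) frame tensor
print(video.shape)
```

Training pairs each scan with the video the subject watched, so the decoder learns embeddings that the generator can turn back into matching frames.
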
Observations and results: The research team found that the videos reconstructed using MinD-Video were "high quality" as defined by motion and scene dynamics. They also reported that the reconstructed videos had an accuracy of about 85% (when compared to the source material that the test subjects watched).

The research team also shared several comparison videos on their project page. In one demo, a video of horses in a field (viewed by a test subject) is compared to a reconstructed video of a more vividly colored version of the horses. In another demo, a video of a car driving through a wooded area is shown, and the reconstructed video shows a first-person point of view (POV) of someone driving down a winding road.

Why it matters: This study represents another advance in efforts to use AI to "read" people's minds, and its results extend the team's previous AI model, MinD-Vis, which converts human brain scans into still images.

Though still in its infancy, the technology could have numerous applications, including helping disabled patients communicate what they see and think.

Update #3. Product Previews and Launches.

Top Picks: Microsoft Build Event and Generative Fill (Adobe Firefly) for Photoshop

Generative Fill Inpainting. Image: Adobe

Adobe brings generative AI (GenAI) to Photoshop: Adobe recently introduced a new GenAI feature called Generative Fill that lets users quickly extend images and add or remove objects with text prompts. The new feature, which has been integrated into Photoshop, works within individual layers of a Photoshop image file. It is similar to Photoshop's existing Content-Aware Fill feature but gives users far more control.

Generative Fill is based on the Adobe Firefly image generator model and uses a well-known AI technique called "inpainting" to seamlessly blend AI-generated imagery into an existing image, which is how objects are added or removed. To extend an image beyond its original borders, Generative Fill uses a related technique called "outpainting."

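Generative Fill itself has no public API, but the same inpainting idea can be tried with the open-source diffusers library: white pixels in a mask mark the region the model repaints to match a text prompt. The file names and prompt below are hypothetical.

```python
# Inpainting with open-source Stable Diffusion (diffusers), as a stand-in
# for Adobe's closed Generative Fill. Requires a GPU as written; drop
# torch_dtype and .to("cuda") to run (slowly) on CPU.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Hypothetical input files: the source image and a mask where
# white pixels mark the region to repaint.
image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(prompt="a red hot air balloon in the sky",
              image=image, mask_image=mask).images[0]
result.save("scene_inpainted.png")
```

Outpainting works the same way, except the mask covers blank canvas added around the original image.
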
Microsoft Build 2023. Image: Microsoft

Microsoft rolls out more AI at its “Build” event: Microsoft made several announcements at its AI-focused Build event, and, spoiler alert, it is incorporating AI into as many applications and services as possible. Here’s a list of some of the major announcements:

  • Windows Copilot: an AI assistant built directly into Windows 11, accessible from the taskbar.

  • Bing in ChatGPT: Bing becomes the default search experience in ChatGPT, giving the chatbot access to current web information with citations.

  • A shared plugin standard: Microsoft is adopting the same open plugin standard as OpenAI, so plugins built for ChatGPT also work across Bing and Microsoft’s Copilots.

  • Microsoft Fabric: a unified, AI-powered analytics platform for data and business intelligence teams.

  • Azure AI Studio: tooling for building and deploying custom copilots on top of Azure OpenAI Service models.

Additional Trending Product Launches

Skybox AI: Blockade Labs has introduced a sketch mode for its Skybox AI image generator, allowing users to create environments based on the lines they draw and text prompts. Users can choose from several tools, guides, and styles to control Skybox generations.

Aria for Opera: Opera has launched its AI side-panel assistant called Aria, which is based on ChatGPT and works as a web and browser assistant to help find web information, generate text or code, and get answers to product questions.

Google Flamingo for YouTube Shorts: Google DeepMind is using its visual language model, Flamingo, to generate descriptions for YouTube Shorts to improve discoverability. Flamingo works by analyzing the initial frames of a video to explain what is happening, then storing the descriptions as metadata to better categorize videos and match search results.

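Flamingo isn't publicly available, but the flow described above (caption a video's early frames, then store the text as searchable metadata) can be approximated with an open captioning model such as BLIP. The video file here is hypothetical.

```python
# Caption a video's first frame and store it as search metadata -- a rough
# approximation of the described Flamingo flow, using the open BLIP model.
import cv2
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained(
    "Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base")

cap = cv2.VideoCapture("short.mp4")      # hypothetical Shorts-style clip
ok, frame = cap.read()                   # grab the opening frame
cap.release()
assert ok, "could not read a frame from the video"

image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
inputs = processor(image, return_tensors="pt")
caption = processor.decode(model.generate(**inputs)[0],
                           skip_special_tokens=True)

metadata = {"video": "short.mp4", "description": caption}  # index for search
print(metadata)
```
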
Update #4. Company Announcements and News Throughout the Industry.

Top Pick: OpenAI Proposes Governance for Superintelligence

Illustration: Justin Jay Wang × DALL·E

OpenAI Superintelligence Governance: OpenAI leaders have called for international regulation of "superintelligent" AI, urging the creation of an international regulatory body equivalent to the International Atomic Energy Agency (IAEA). This regulatory body would inspect AI systems, require audits, test for compliance with safety standards, and impose restrictions on deployment.

The company's co-founders and CEO also called for a degree of coordination among companies working at the forefront of AI research. They warned of the risks of superintelligence, including the possibility of losing the ability to self-govern and becoming dependent on machines.

Anthropic Raises $450M: Anthropic, a generative AI startup co-founded by OpenAI veterans, has raised $450 million in a Series C round, bringing its total funding to $1.45 billion. Anthropic is best known for its ChatGPT competitor, Claude, and its "Constitutional AI" training approach for building safe foundation models.

Nvidia Shares Soar: Nvidia issued a Q2 revenue forecast more than 50% above Wall Street's estimates, driven by surging demand for its AI chips. On the news, Nvidia's shares jumped ~30%, valuing the company at roughly $950 billion and making it the world's most valuable chipmaker.

Thanks for reading this issue of the AstroFeather newsletter!

I’m always looking for ways to improve and would love to hear your constructive feedback about the format and content of the newsletter. You can reply to this email, and I’ll be sure to respond.

See you in the next issue!

If you enjoy AstroFeather weekly content, be sure to share this newsletter!

Adides Williams, Founder @ AstroFeather (astrofeather.com)
