AIGC

  • The fruit-cutting ASMR videos blowing up across the internet turn out to be made like this! (with an AI generation tutorial + a free way to use it)

    Recently TikTok, Xiaohongshu, and YouTube have been swept by a wave of "AI fruit-cutting ASMR" videos:
    A knife comes down gently, a crystal strawberry snaps apart with the crisp sound of breaking glass, and in a few seconds your whole body relaxes. These clips rack up millions of views, with comment sections full of people begging for the "original video."

    How exactly are these types of videos made? Which AI tool is used? Is it complicated? Is there a fee?

    In today's post, I'll walk through making these videos with Google Veo 3 for free: zero cost, zero editing experience needed, just follow along 👇 (note: accessing Veo 3 requires a bit of "magic" network configuration, i.e. a VPN/proxy)

    ✅ Step 1: Prepare the prompt

    The core of video generation is the text prompt. It's like giving instructions to the AI:

    "What kind of scene are you shooting, what objects are there, what shots, what sounds"

    Don't invent prompts from scratch. The best approach for beginners: copy first, then tweak, then write your own.

    🧠 Example prompt 1 (good for getting started quickly):

    Realistic 4K close-up footage of a knife rapidly cutting a glowing purple glass peach on a wooden cutting board. Each slice falls apart with a crisp ASMR-style glass shatter sound.

    👉 You just need to change "purple glass peach" to the desired fruit, e.g. glass mango / apple / lemon...

    🧠 Example prompt 2 (premium + multiple perspectives):

    Shot in extreme macro, a flawless, crystal-clear [fruit] rests on a wooden board under warm light. The knife slices it slowly with a clean "ting" sound. Reflections shimmer on the surface, ASMR-style audio layers blend gently in a quiet environment.

    👉 Replace [fruit] with the object you want to make, e.g. glass watermelon / diamond pineapple, etc.

    🔄 Generate prompts quickly (let AI write them for you):

    Have DeepSeek / ChatGPT mimic these structures and build a template, so that one line of input produces a complete prompt, for example:

    Input: blue glass lemon
    Output: a complete Veo prompt paragraph
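
    If you'd rather script this templating step than ask a chatbot each time, a minimal Python sketch like the one below works. The template text is just an adaptation of example prompt 2, and the function name is arbitrary:

    ```python
    # Minimal sketch: expand a one-line object description into a full Veo-style prompt.
    # The template is adapted from example prompt 2 above; tweak the wording to taste.

    PROMPT_TEMPLATE = (
        "Shot in extreme macro, a flawless, crystal-clear {obj} rests on a wooden board "
        "under warm light. The knife slices it slowly with a clean \"ting\" sound. "
        "Reflections shimmer on the surface, ASMR-style audio layers blend gently "
        "in a quiet environment."
    )

    def make_prompt(obj: str) -> str:
        """Turn a short description (e.g. 'blue glass lemon') into a complete prompt."""
        return PROMPT_TEMPLATE.format(obj=obj.strip())

    if __name__ == "__main__":
        print(make_prompt("blue glass lemon"))
    ```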

    ✅ Step 2: Generate Video with Veo 3

    Veo 3 is Google's latest text-to-video tool, supporting 1080p output, ASMR-grade sound, and multi-angle shots.

    📍 Method 1: Gemini official website (easy to use)

    Link: https://gemini.google.com

    • Use the Gemini 2.5 Pro model
    • Enter your prompt
    • Click the video button → wait for generation

    📍 Method 2: Google Labs Flow (customizable)

    Link: https://labs.google/flow/

    • Switch the model to: Veo 3 - Fast (Text to Video)
    • Generate 1-4 videos at once, with frame continuity and transitions
    • More flexible credit usage and more parameters
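
    If you'd rather call Veo from code than click through either interface, Google's google-genai Python SDK exposes a video-generation endpoint. The sketch below follows the SDK's published long-running-operation pattern, but treat the model identifier and availability as assumptions: Veo access is gated, and the exact model string for your account and region may differ, so check the current docs before running it.

    ```python
    # Hedged sketch using the google-genai SDK (pip install google-genai).
    # The model id below is an assumption; Veo access is gated, so confirm the
    # identifier available to your account in Google's current documentation.
    import time

    from google import genai

    client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

    prompt = (
        "Realistic 4K close-up footage of a knife rapidly cutting a glowing purple "
        "glass peach on a wooden cutting board. Each slice falls apart with a crisp "
        "ASMR-style glass shatter sound."
    )

    # Video generation is a long-running operation: start it, then poll until done.
    operation = client.models.generate_videos(
        model="veo-3.0-generate-preview",  # assumed model id
        prompt=prompt,
    )
    while not operation.done:
        time.sleep(10)
        operation = client.operations.get(operation)

    # Download and save the first generated clip.
    video = operation.response.generated_videos[0]
    client.files.download(file=video.video)
    video.video.save("glass_peach_asmr.mp4")
    ```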

    🎬 Final advice: don't just play around, build an account!

    ASMR fruit-cutting videos aren't just a toy; they're a traffic formula plus a repeatable content model.

    Like the bloggers who blew up on TikTok, you can generate these videos in batches, post on a schedule, and build a dedicated account around them.

    You can also take the monetization route:

    • Package your generation experience and editing process → sell paid tutorials
    • Sell finished clips → list them on Taobao / Weidian
    • Promote AI tools → earn affiliate commissions

    These videos are easy to make, low-barrier, and extremely relaxing, which makes them great for consistent posting on short-video platforms.

  • Found a great AI project on GitHub: Cradle. It can control the mouse and keyboard to simulate human operation, impressively smooth. Worth bookmarking!

    Cradle is an open-source multimodal AI agent framework for General Computer Control (GCC) from the BAAI-Agents team. It lets large multimodal models use all kinds of software and games the way a human would, taking screenshots as input and producing keyboard and mouse actions as output.

    • General-purpose goal: supports any native software (e.g. games, Office, image/video editing tools)
    • Multimodal I/O: screenshots as input, keyboard and mouse actions as output
    • Autonomy: built-in "self-reflection + skill curation" modules for continuous self-optimization
    • Modular design: highly controllable and extensible, easy to adapt to new environments

    Pain point scenarios

    Since the arrival of the GPT series of large models, LLMs have grown explosively. But they rely on API-based text input/output, which means they can't operate local interfaces, so local task automation remains difficult:

    • Limited ability to operate Office and visualization software
    • Complex tasks are hard to decompose and close the loop on
    • No visual perception: locating UI elements from language alone is unreliable
    • No long-term memory of history, weak execution of multi-step logic

    Cradle is designed to address these pain points:

    • Controls the mouse and keyboard to simulate human operation
    • Strengthens "self-reflection" and "skill optimization" strategies
    • Supports long-horizon tasks, complex game environments, and specialized software

    Core functionality

    Here are Cradle's six core modules:

    1. Information Gathering
      • Uses a visual model to process UI screenshots and text information
      • Audio feedback can also be incorporated to complete the perceptual input
    2. Self-Reflection
      • Reviews the results of past actions to judge whether the goal was achieved
      • Summarizes causes of failure to guide the next run
    3. Task Inference
      • Infers the current goal from the environment plus historical memory
      • Dynamically plans the next best strategy
    4. Skill Curation
      • Generates or updates skill functions for each task
      • Accumulates experience with strategies customized per environment
    5. Action Planning
      • The LLM outputs high-level actions (e.g., "click X", "move the mouse to Y")
      • A hand-written bridging layer translates them into keystrokes and mouse movements
    6. Memory
      • Short-term and long-term memory, including a history of past runs
      • Supports reuse of memories and skills across tasks

    These modules form a closed loop: screenshot input → perception → reflection → planning → execution → memory feedback (sketched below).
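
    To make that loop concrete, here is a rough Python skeleton of a perception → reflection → planning → execution → memory cycle in the same spirit. This is not Cradle's actual code or API: call_llm is a placeholder you would wire to a multimodal model, and pyautogui simply stands in for the keyboard/mouse bridging layer.

    ```python
    # Rough skeleton of a screenshot-in / keystrokes-out agent loop in the spirit of
    # Cradle's six modules. NOT Cradle's real API: call_llm is a stub to be wired to
    # a multimodal model, and pyautogui plays the role of the bridging layer.
    import pyautogui  # pip install pyautogui


    def call_llm(screenshot, memory) -> dict:
        """Stub: send the screenshot + memory to a multimodal LLM and get back
        a high-level action such as {"type": "click", "x": 100, "y": 200}."""
        raise NotImplementedError


    def execute(action: dict) -> None:
        # Bridging layer: translate a high-level action into keyboard/mouse events.
        if action["type"] == "click":
            pyautogui.click(action["x"], action["y"])
        elif action["type"] == "type":
            pyautogui.typewrite(action["text"], interval=0.05)


    def run(task: str, max_steps: int = 20) -> None:
        memory = [f"task: {task}"]           # memory module (here: a simple list)
        for _ in range(max_steps):
            shot = pyautogui.screenshot()    # information gathering
            action = call_llm(shot, memory)  # reflection + task inference + planning
            if action.get("type") == "done":
                break
            execute(action)                  # action execution
            memory.append(f"did: {action}")  # memory feedback
    ```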

    Experiments show that Cradle can handle:

    • AAA game: Red Dead Redemption 2, completes main quests with a high success rate;
    • City-building game: Cities: Skylines, builds a city of a thousand residents;
    • Farming game: Stardew Valley, automatic planting and harvesting;
    • Business game: Dealer's Life 2, achieves the highest weekly profit (87%);
    • Office software: signs in to Chrome, replies in Outlook, uses Feishu;
    • Editing tools: image/video processing in Meitu XiuXiu and CapCut.

    Technical architecture

    Technical advantages

    Technical advantage | Description
    No API access required | Does not rely on internal UI interfaces; adapts to a wide range of software
    Highly modular | Easily extensible to new games or software environments
    Progressive capability growth | LLM + self-reflection + memory techniques support self-improvement
    Universal operating interface | Screenshot input + keyboard/mouse output, truly general-purpose

    [Figure: illustration of the Cradle interface]

    Application scenarios

    • R&D: an AI agent can autonomously simulate user operations, replacing UI/API testing
    • Office automation: large volumes of repetitive tasks (emails, forms, reports) can be fully automated
    • Game AI development: act as an in-game agent to test quests or train NPCs
    • Process automation: provides a UI automation pipeline with less reliance on traditional RPA
    • Education and training: Cradle can demonstrate operations and help learners understand complex software

    How does it compare?

    Framework | Interaction mode | Relies on an API? | Key characteristics | Core strengths
    Cradle | Screenshots + keyboard/mouse | ❌ No API | Complete closed loop, self-directed learning | Versatile, modular, adapts to many environments
    LangChain Agent | Text API input/output | ✅ Yes | Text commands / HTTP requests | Strong at information retrieval and text tasks
    AutoHotkey / RPA tools | Keyboard and mouse macros | ❌ No API | Single-step macros, no memory or planning | Easy to use but low intelligence, weak self-improvement
    Playwright / Selenium | DOM manipulation API | ✅ DOM API | Web automation | Strong on the web, more limited than desktop control

    Strengths: Cradle is a multimodal, cognition-enabled "universal software operator" that goes beyond traditional desktop and web automation tools.

    Article Summary

    • Cradle is the first general-purpose computer-control AI agent, supporting a wide range of local software and AAA game operations.
    • Its core is six modules that provide self-reflection, self-learning, and self-adaptation.
    • The technical architecture is modular and maintainable.
    • Compared with traditional tools, Cradle has visual perception and global closed-loop intelligence, making it more general and more efficient.
    • Well suited to R&D automation, office work, game development, and teaching scenarios.

    Project Address

    https://github.com/baai-agents/cradle

  • Generating immersive historical story videos with an AI agent in one click, and it's so satisfying!

    Hi everyone, I'm Li Hua, an AI blogger with over 100,000 followers across platforms, focused on AI explainers and agent sharing.

    Recently I've been building agent workflows in Coze for generating self-media short videos, with the goal of producing all kinds of viral short videos in one click.

    Today I'm sharing the immersive historical storytelling video agent workflow. Let's start with a case study.

    The Douyin account @strange history has only 56 posts but has gained 480,000 followers, and every video has gone viral.

    After a few days of research, I finally got the logic and flow of it sorted out and developed the workflow.

    All I have to do is enter a history topic in the workflow's run box and click trial run, and a viral-style history short video is generated for me in one click.

    For example, I typed in "darkness", clicked run, and got a short-video draft; after exporting the draft from JianYing (CapCut), the video below was produced.

    Having seen the demo, how was this workflow built?

    I. Design approach:

    1. Use a large model to generate the script topic, background, and image prompts from the theme
    2. Derive storyboard shots from those prompts
    3. Determine the on-screen presentation via the image-generation and canvas modules
    4. Generate a timeline from the audio and create a JianYing (CapCut) draft

    II. Detailed workflow analysis

    1. Start node

    2. Generate the short-video script from the theme

    3. Use the large model to generate storyboard prompts

    4. Batch-generate the images and audio

    5. Create the JianYing (CapCut) draft and generate the corresponding timeline from the audio (see the sketch below)
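
    The timeline in step 5 boils down to simple arithmetic: each image stays on screen for as long as its narration clip lasts, so start and end times are a running sum of the audio durations. A hypothetical sketch (clip durations and field names are made up for illustration; this is not Coze's or JianYing's actual schema):

    ```python
    # Hypothetical sketch of the timeline math behind step 5: each image is shown for
    # the duration of its narration clip, so start/end times are a running sum.
    # Durations and field names are illustrative, not Coze's or JianYing's real schema.

    def build_timeline(audio_durations_ms: list[int]) -> list[dict]:
        timeline, cursor = [], 0
        for i, dur in enumerate(audio_durations_ms):
            timeline.append({"shot": i + 1, "start_ms": cursor, "end_ms": cursor + dur})
            cursor += dur
        return timeline

    if __name__ == "__main__":
        # e.g. three narration clips of 4.2 s, 3.8 s and 5.0 s
        for seg in build_timeline([4200, 3800, 5000]):
            print(seg)
    ```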

    At this point, the immersive historical storytelling workflow is complete.

    Thanks for reading. If this article helped you, please like and follow; I'll keep sharing good workflow tutorials.

    This workflow has been added to our shared space, where dozens of agent workflows are available for you to try~!


    Welcome to my AI agent co-creation space!

    Here's what you'll get:

    〇 Dozens of agent workflows at your disposal once you join the co-creation space, with more added all the time!

    〇 Exclusive member group Q&A service: any question about using the agent workflows can be asked there

    〇 Members can also suggest agents they need; workflows with high demand will be developed and added to the space for everyone to use!