Jupiter Text Cleaner - Help & Documentation

Complete guide to AI-style pattern reduction, privacy features, and advanced usage

📋 Overview

What is Jupiter Text Cleaner?

Jupiter Text Cleaner is a privacy-first text processing tool designed to reduce obvious AI fingerprints and improve readability. Unlike many similar tools, it uses no AI itself: it is a rules-based system that applies deterministic pattern matching to clean and normalize text.

🔒 Privacy-First Design

  • Client-Side Core Processing: Core cleaning happens in your browser
  • Optional External Grammar Asset: Grammar checking may load Harper.js when you enable it, then runs locally in the browser
  • No Text Collection by Design: The app is designed not to store or transmit pasted text for core cleaning
  • Open Source Transparency: All code is visible and auditable

🎯 Core Capabilities

  • AI Fingerprint Reduction: Targets 100+ common linguistic patterns often associated with machine-generated text
  • Style Naturalization: Applies practical rules for reducing overly formal, uniform, or machine-like phrasing
  • Prompt-Injection Flagging: Highlights suspicious instruction-like or script-like text so you can decide what to keep or remove
  • Compact Diagnostics: Shows quick counts for hidden/special characters, prompt/script risks, AI-style wording, and optional readability signals
  • Grammar & Spell Checking: Optional local grammar checking with Harper.js WebAssembly
  • Local TXT Import: Import a local .txt file in the browser for cleanup, subject to file type and size checks
  • Multiple Export Formats: Download as TXT, HTML, or a local JSON report that includes text, settings, and diagnostics
  • Privacy Controls: Choose between Enhanced (with grammar) or Ultra Privacy (text cleaning only) modes

🚀 What Makes This Different

  • AI-Style Cleanup: Focused on reducing obvious machine-like wording and formatting artifacts, not just general text cleaning
  • Research-Informed: Built around practical linguistic patterns commonly associated with AI-style writing
  • Privacy-First: Core cleaning runs in the browser, with optional grammar resources controlled by privacy settings
  • No Cloud AI Processing: Core cleanup is rules-based with optional browser-based grammar checking
  • Practical Workflows: Useful for business, academic, and creative editing where manual review remains important

🚀 Getting Started

Quick Start (3 Steps)

  1. Paste or Import Your Text: Paste text into the input area, or use "Import TXT" for a local plain-text file
  2. Select Options: Choose your preferred settings:
    • For beginners: Click "Restore Defaults" for a practical structure-safe cleanup preset
    • For stricter privacy: Enable "Ultra Privacy Mode"
    • For enhanced results: Enable grammar checking and advanced humanization
  3. Process & Export: Click "Run Cleaner" and choose your export method:
    • Copy cleaned text for immediate use
    • Download as TXT for documents
    • Download as HTML for formatted sharing
    • Download a local JSON report that includes settings, diagnostics, original text, and cleaned text
    • Use "Show Diff" when you want a simple before/after line comparison

🔧 Advanced Features

  • Privacy Modes: Toggle between Enhanced (full features) and Ultra Privacy (text cleaning only)
  • Grammar Checking: Optional spell and grammar checking with local WebAssembly processing
  • Structure Preservation: Maintain bullet points, indentation, and paragraph breaks while cleaning
  • Character Revealing: See hidden characters and AI fingerprints with color-coded highlighting
  • Safety Diagnostics: Flag possible hidden instructions, prompt-injection attempts, and script-like payloads without deleting them by default
  • Readability Diagnostics: Optionally flag long sentences and very long words for manual review
  • Before/After Diff: Show a simple line comparison after cleanup to review what changed
  • Local TXT Import: Load plain-text files locally through the browser without changing the core browser-local cleanup model
  • AI Fingerprint Detection: Visual highlighting of AI overused words, formal phrases, and uncontracted grammar
  • Comprehensive Test Text: Load sample text demonstrating all cleaning capabilities
  • Undo Functionality: Restore original text if needed (hidden by default)
  • Consistent Settings: Saved preferences help you reuse the same cleanup approach across documents

Enhanced Reveal Codes Feature

The "Reveal Codes" functionality has been enhanced to show both invisible characters AND AI fingerprint patterns with color-coded highlighting:

🎨 Color-Coded Highlighting

  • 🔴 Red Highlights: AI overused words like "delve", "leverage", "synergy", "paradigm"
  • 🟠 Orange Highlights: Formal AI phrases like "furthermore", "in conclusion", "it is important to note that"
  • 🟢 Green Highlights: Uncontracted grammar with suggested contractions (e.g., "I am" → "I'm")
  • Dark Red Highlights: Suspicious prompt-injection, hidden-instruction, or script-like text for review
  • 🟣 Purple Highlights: Invisible Unicode spaces ([ZWS], [NBSP])
  • 🔴 Red Highlights: Bidirectional formatting and control characters
  • 🔵 Blue Highlights: Smart quotes and dashes
  • 🩢 Grey Highlights: Common whitespace characters
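
As a rough illustration of how a reveal pass for invisible characters could work, the sketch below swaps each mapped character for a visible label and counts the hits. The [ZWS] and [NBSP] labels come from the list above; the other labels, the table name REVEAL_LABELS, and the function name revealCodes are assumptions for this example, not the app's actual code.

```javascript
// Hypothetical label table; the app's real mapping is not shown in this guide.
const REVEAL_LABELS = new Map([
  ["\u200B", "[ZWS]"],  // zero-width space
  ["\u00A0", "[NBSP]"], // non-breaking space
  ["\u200D", "[ZWJ]"],  // zero-width joiner (assumed label)
  ["\u202E", "[RLO]"],  // right-to-left override (assumed label)
  ["\uFEFF", "[BOM]"],  // byte order mark (assumed label)
]);

// Replace each mapped character with its visible label and count the hits.
function revealCodes(text) {
  let count = 0;
  let revealed = "";
  for (const ch of text) {
    const label = REVEAL_LABELS.get(ch);
    if (label) {
      count += 1;
      revealed += label;
    } else {
      revealed += ch;
    }
  }
  return { revealed, count };
}

console.log(revealCodes("Hello\u200Bworld\u00A0again"));
```

A real reveal view would also wrap each label in a colored element; that part depends on the page markup and is left out here.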

🧪 Comprehensive Test Text

Click "Load Test Text" to load a comprehensive sample that demonstrates:

  • Many common AI fingerprint patterns in natural sentence context
  • Real invisible Unicode characters included in the sample
  • Smart formatting and control characters
  • Transcript cleanup examples
  • Technical artifacts and international characters

💡 Usage Tips

  • Before Cleaning: Use Reveal Codes to see what will be removed
  • After Cleaning: Use Reveal Codes to verify successful removal
  • Diagnostics First: Review the compact diagnostic counts before enabling any destructive removal options
  • Compare Carefully: Use Show Diff for a line-level before/after check when changes need closer review
  • Learning Tool: Hover over highlights to see detailed explanations
  • Quality Check: Ensure no unwanted characters remain in your text

📋 Understanding the Options

Character Cleaning

  • Transcript Cleanup: Removes timestamps and stitches fragmented transcript lines
  • Common Whitespace: Handles newlines, tabs, and special Unicode spaces
  • Control Characters: Removes invisible formatting characters
  • Smart Formatting: Converts smart quotes, dashes, and other typographic elements

AI Styling Removal

  • Remove AI Overused Words: Eliminates 100+ AI-favored words and phrases
  • Humanize Sentence Structure: Creates natural sentence length variation
  • Add Controlled Flaws: Introduces natural contractions and imperfections
  • Reduce Formality: Replaces academic language with conversational alternatives
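
One way to picture the contraction part of "Add Controlled Flaws" is a small phrase table applied with word-boundary matching. The table contents and the name applyContractions are assumptions for this sketch; the tool's real list is not documented here.

```javascript
// Small assumed sample; the tool's actual contraction list is not documented here.
const CONTRACTIONS = [
  ["I am", "I'm"],
  ["do not", "don't"],
  ["it is", "it's"],
  ["cannot", "can't"],
];

function applyContractions(text) {
  let out = text;
  for (const [full, short] of CONTRACTIONS) {
    const pattern = new RegExp(`\\b${full}\\b`, "gi");
    out = out.replace(pattern, (match) =>
      // Keep a leading capital when the matched phrase started with one.
      match[0] === match[0].toUpperCase()
        ? short[0].toUpperCase() + short.slice(1)
        : short
    );
  }
  return out;
}

console.log(applyContractions("I am sure it is fine. Do not worry."));
```

A blind rule like this will also rewrite "Yes, it is." into "Yes, it's.", which reads wrong. That is one reason the output still needs a manual pass.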

Preservation Options

  • Preserve Bullet Points: Protects bullet points and numbered lists
  • Preserve Indentation: Maintains tab and space indentation
  • Preserve Paragraph Breaks: Keeps double newlines for paragraph separation
  • Preserve Carriage Returns: Maintains Windows text file formatting

🧠 AI Humanization Technology (Updated March 2025)

Research-Based Humanization

Jupiter Text Cleaner applies research-informed editing techniques to reduce obvious AI writing signals and improve readability. The approach draws on practical analysis of the writing traits that AI detectors commonly look for and on known linguistic style markers.

📊 Advanced Linguistic Analysis

  • Perplexity & Burstiness: Introduces natural sentence length variation to mimic human writing patterns
  • Lexical Density Optimization: Balances formal and informal vocabulary to match human writing
  • Stylometric Marker Removal: Eliminates 100+ AI-favored words and phrases identified through linguistic analysis
  • Sentence Structure Variation: Breaks uniform AI sentence patterns into natural, varied structures
  • Writing Style Normalization: Adjusts tone, formality, and complexity to human levels

🎯 Targeted AI Patterns

Based on common public discussions of AI-style writing patterns, the tool can target:

Overused AI Words

These words appear disproportionately in AI-generated text:

  • Business/Corporate: "delve", "foster", "elucidate", "leverage", "comprehensive", "robust", "seamless"
  • Academic/Formal: "innovative", "paradigm", "synergy", "holistic", "ecosystem", "tapestry"
  • Descriptive: "profound", "meticulous", "nuanced", "intricate", "embark", "navigate", "illuminate"
  • Action Verbs: "harness", "empower", "transcend", "revolutionize", "elevate", "enhance"

Formal Language Patterns

AI models tend to use overly formal language:

  • Transitions: "furthermore", "moreover", "consequently", "therefore", "in conclusion"
  • Introductions: "it is important to note that", "it should be noted that", "it is worth mentioning"
  • Closings: "in summary", "to summarize", "in closing", "to wrap up"
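
Flagging the words and phrases listed above can be sketched as a whole-word, case-insensitive search. The short term lists are taken from the examples in this guide; the function names and the shape of the findings are assumptions for illustration.

```javascript
// Sample terms from this guide; the full 100+ pattern list is not reproduced here.
const OVERUSED_WORDS = ["delve", "leverage", "synergy", "paradigm"];
const FORMAL_PHRASES = ["furthermore", "in conclusion", "it is important to note that"];

// Collect every whole-word match of each term, with its position in the text.
function findMatches(text, terms, category) {
  const findings = [];
  for (const term of terms) {
    const pattern = new RegExp(`\\b${term}\\b`, "gi");
    for (const match of text.matchAll(pattern)) {
      findings.push({ category, term, index: match.index });
    }
  }
  return findings;
}

function flagAiStyle(text) {
  return [
    ...findMatches(text, OVERUSED_WORDS, "overused-word"),
    ...findMatches(text, FORMAL_PHRASES, "formal-phrase"),
  ].sort((a, b) => a.index - b.index);
}

console.log(flagAiStyle("Furthermore, we delve into the paradigm."));
```

Because of the word-boundary check, inflected forms such as "delves" are not matched by this sketch; a fuller rule set would need to list them or use a stem-aware pattern.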

Structural Uniformity

AI writing often shows unnatural consistency:

  • Sentence Length: Uniform 12-18 word sentences (low burstiness)
  • Grammar: Overly polished wording with few contractions or natural variations
  • Punctuation: Overly formal punctuation patterns
  • Formatting: Consistent paragraph structure and organization
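
Sentence-length uniformity can be made concrete with a simple spread measure. The sketch below uses the standard deviation of words per sentence as a stand-in for "burstiness"; that choice, the naive sentence splitter, and the function names are assumptions, not the tool's documented metric.

```javascript
// Naive splitter: breaks after ., !, or ? followed by whitespace.
function sentenceLengths(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => s.split(/\s+/).length);
}

// Mean and population standard deviation of words per sentence.
function burstiness(text) {
  const lengths = sentenceLengths(text);
  if (lengths.length === 0) return { mean: 0, stdDev: 0, lengths };
  const mean = lengths.reduce((a, b) => a + b, 0) / lengths.length;
  const variance =
    lengths.reduce((sum, n) => sum + (n - mean) ** 2, 0) / lengths.length;
  return { mean, stdDev: Math.sqrt(variance), lengths };
}

console.log(burstiness("Short. This one is a much longer sentence here."));
```

A standard deviation near zero means every sentence is about the same length, which is the uniform pattern described above. Abbreviations such as "e.g." will confuse this splitter, so treat the numbers as a rough signal only.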

🔬 Research-Based Techniques

  • Controlled Imperfections: Adds natural contractions and occasional sentence fragments
  • Formality Reduction: Replaces academic language with conversational alternatives
  • Burstiness Enhancement: Creates natural variation in sentence complexity and length
  • Transition Word Optimization: Reduces overused transition words that signal AI writing
  • Vocabulary Balancing: Mixes formal and informal vocabulary appropriately
  • Sentence Fragmentation: Occasional short sentences for natural rhythm

📈 Practical Quality Signals

  • Predictability Reduction: Helps reduce some formulaic wording patterns
  • Rhythm Variation: Encourages more varied sentence length and cadence where selected
  • Visible Artifact Cleanup: Removes hidden characters and obvious formatting issues
  • Readability Improvement: Helps produce cleaner text for manual review

🛡️ Privacy & Security Details

🔒 Privacy Design

  • Local Core Cleaning: Core cleaning is designed to run in your browser
  • Optional Grammar Loading: Grammar checking may load a Harper.js browser asset when enabled
  • No Account Required: No registration, login, or personal information needed
  • No Cookies for Tracking: Only uses localStorage for your preferences
  • Open Source Code: Full transparency - you can verify every operation
  • No Text Telemetry by Design: Pasted text is not intentionally collected for analytics or training
  • User Control: Ultra Privacy mode disables optional grammar loading

🛡️ Security Features

  • Content Security Policy: CSP restricts external resource loading to approved sources
  • Input Validation: A 1MB text limit helps keep processing responsive and guards against oversized-input abuse
  • Rate Limiting: A limit of 60 operations per minute helps prevent abuse
  • Bot Detection: Identifies and throttles automated behavior
  • Memory Management: Automatic cleanup prevents memory leaks
  • XSS Protection: Proper HTML escaping prevents injection attacks
  • HTTPS Enforcement: Secure connection when available
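
The 60-operations-per-minute limit could be enforced with a sliding window of timestamps. This is a sketch of that general technique only; the name createRateLimiter and the injectable clock are assumptions, and the app may implement its limit differently.

```javascript
// Sliding-window limiter. The clock is injectable so the logic can be tested
// without waiting a real minute.
function createRateLimiter(maxOps = 60, windowMs = 60000, now = () => Date.now()) {
  let timestamps = [];
  return function tryAcquire() {
    const current = now();
    const cutoff = current - windowMs;
    // Drop operations that have aged out of the window.
    timestamps = timestamps.filter((t) => t > cutoff);
    if (timestamps.length >= maxOps) return false;
    timestamps.push(current);
    return true;
  };
}

const allowOperation = createRateLimiter();
console.log(allowOperation()); // first call in a fresh window is allowed
```

A caller would check the returned boolean before each cleaning run and show a short "please wait" message when it is false.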

📋 Privacy Modes Explained

  • Enhanced Mode (Default): Core cleanup plus optional grammar checking (one-time browser asset download)
  • Ultra Privacy Mode: Text cleaning only; optional grammar resources are disabled
  • Browser-Based Grammar: Grammar checking is intended to run locally after its optional asset loads
  • Customizable: Users control exactly which features to enable

Technical Security Details

  • WebAssembly Sandbox: Grammar checker runs in isolated WebAssembly environment
  • No External APIs: No communication with external servers for text processing
  • Local Storage Only: Preferences stored locally in browser
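  • No Third-Party Scripts by Default: Only self-hosted JavaScript is used, plus the optional Harper.js asset loaded from a CDN when grammar checking is enabled
  • No Network Calls for Core Cleaning: Core processing runs offline after the initial page load; the optional grammar asset is fetched only if you enable grammar checking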

🎨 Use Cases & Workflows

Content Creation

  • Blog Posts & Articles: Make AI-assisted drafts read more naturally and closer to your own voice
  • Academic Papers: Remove AI markers while maintaining professional tone and credibility
  • Marketing Copy: Humanize AI-generated marketing materials for better engagement
  • Social Media Content: Create more authentic-feeling posts from AI drafts with a cleaner, less robotic tone
  • Creative Writing: Transform AI-generated stories into publishable content
  • Technical Documentation: Clean AI-generated docs while maintaining accuracy

💼 Business Applications

  • Report Writing: Reduce obvious machine-like wording in business reports and analyses
  • Email Communications: Smooth out overly formal or uniform AI-assisted drafts
  • Internal Communications: Improve readability in AI-assisted memos and announcements
  • Content Marketing: Humanize bulk content for better reader engagement
  • Legal Documents: Clean AI-generated legal text while maintaining precision
  • Training Materials: Create natural-feeling educational content from AI drafts

🔧 Development Workflows

  • App Integration: Use JSON export for custom application development
  • Repeatable Settings: Reuse saved cleanup preferences for consistent document handling
  • API Development: Build privacy-first text processing services and APIs
  • Quality Assurance: Run a practical edit pass for tone, readability, and obvious AI markers before publishing
  • Content Management: Integrate into CMS workflows for automated content cleaning
  • SEO Optimization: Improve clarity and natural phrasing for better reader engagement and retention

✅ Real-World Limits & Best Practice

This software is strongest at removing obvious machine-like patterns and hidden formatting artifacts. It is not a guarantee against evolving AI detection systems, and no tool can promise universal bypass results.

  • What it does well: Hidden character cleanup, formal phrase reduction, overused AI word cleanup, and readability normalization.
  • What it may miss: Deep narrative consistency signals, domain-specific phrasing tells, and newly emerging detector heuristics.
  • When auto-fixes can hurt: Some replacements can flatten voice, remove useful precision, or make text feel unnatural in technical/legal contexts.
  • Best practice: Treat outputs as a strong first draft, then manually review tone, facts, and voice alignment before publishing.

🧩 Recommended Workflow (Prompt + Structure-Safe Mode + Manual Review)

  1. Generate your first draft with a high-quality prompt (examples below).
  2. Paste into Jupiter Text Cleaner and run Structure-Safe mode first.
  3. Read aloud and manually adjust rhythm, specificity, and sentence variety.
  4. Use Full Cleanup options only when needed for heavier rewriting.
  5. Final check: verify facts, citations, and audience fit.

Tip: For light edits, use Structure-Safe mode directly. For stronger rewriting, use the advanced engineering prompt below, then run Jupiter Text Cleaner as a final cleanup pass.

⚙️ Advanced Option: Humanization Engineering Prompt (OpenAI / Claude)

If you want a stronger first draft before using Jupiter Text Cleaner, use this advanced prompt with an OpenAI model or Claude.

Engineering Prompt (Advanced)

You are a senior editor rewriting text to sound naturally human while preserving meaning.

Context:
- Audience: [TARGET AUDIENCE]
- Tone: [CONVERSATIONAL / PROFESSIONAL / TECHNICAL]
- Region style: [US / UK / OTHER]

Hard constraints:
1) Preserve factual meaning exactly. Do not invent facts, citations, names, or numbers.
2) Keep key domain terms where precision matters.
3) Return only the rewritten text.

Humanization objectives:
- Increase sentence-level burstiness: mix short, medium, and long sentences.
- Reduce low-perplexity phrasing: avoid predictable generic wording when better natural wording exists.
- Replace formulaic transitions ("furthermore", "moreover", "in conclusion") with natural flow.
- Reduce concept looping and repeated phrasing.
- Add natural cadence: occasional contractions, brief fragments, or short emphasis lines.
- Use specific, concrete wording over abstract filler.
- Keep paragraph structure varied (not rigid topic-sentence every time).
- Keep readability high and avoid awkward synonym swaps.

Quality checks before final output:
- Does it read like a knowledgeable human wrote it?
- Does each paragraph vary rhythm and structure?
- Are facts unchanged and verifiable?

Text to rewrite:
[PASTE TEXT]

Claude Format (same rules)

<instructions>
Apply the same constraints and objectives above.
Preserve meaning exactly, improve natural human cadence, and return only revised text.
</instructions>

<text>
[PASTE TEXT]
</text>

⚠️ Humanization Disclaimer (Important)

  • No guarantee of detector outcomes: No prompt, model, or cleaning workflow can guarantee bypass of AI detection systems.
  • Detectors evolve continuously: Results vary by detector, domain, writing quality, and policy changes.
  • Use ethically and lawfully: Follow your school, workplace, and platform policies when using AI-assisted writing.
  • You remain responsible: Always verify facts, citations, and final wording before publication or submission.
  • Best use case: Treat these techniques as readability and style improvement methods, not a compliance or evasion mechanism.

At the time of writing, many users report that Claude often produces very natural cadence when prompted with clear style constraints. OpenAI models can produce similarly strong results with explicit structure and tone instructions.

🎓 Creative Industries

  • Publishing: Clean AI-generated manuscripts for publication
  • Journalism: Remove AI markers from AI-assisted news articles
  • Screenwriting: Humanize AI-generated scripts and dialogue
  • Copywriting: Transform AI ad copy into natural-sounding marketing text
  • Translation: Clean machine-translated text to appear human-written

❓ Frequently Asked Questions

Q: Does this use AI to process text?

A: Core cleanup is rules-based and does not use cloud AI processing. The optional grammar checker uses a browser-based WebAssembly asset, not a cloud AI service.

Q: Is my text private and secure?

A: Core cleaning is designed to happen in your browser without uploading pasted text. Optional grammar checking may load a browser-based Harper.js asset when enabled. For highly sensitive material, use Ultra Privacy mode and review your browser/extension environment.

Q: What makes this different from other text cleaners?

A: We focus on AI-style pattern reduction using practical, rules-based editing patterns. The tool targets common linguistic markers associated with machine-like writing, while core cleanup stays in the browser.

Q: What AI-style wording can Jupiter Text Cleaner help reduce?

A: It can reduce common AI-style wording patterns such as formal filler, repeated transition phrases, overused words, overly polished phrasing, repetitive sentence rhythm, and hidden formatting artifacts often found in copied or AI-assisted drafts. It is rules-based, so it helps with review and cleanup rather than guaranteeing a specific detector result.

Q: Does this remove non-obvious AI text artifacts?

A: It can help with less visible issues such as zero-width spaces, non-breaking spaces, Unicode formatting marks, smart punctuation artifacts, repeated wording, and overly uniform constructions. These issues are useful to clean because they can affect readability, formatting, copying, and downstream review.

Q: What are prompt/script risk diagnostics?

A: Prompt/script risk diagnostics are rule-based checks for suspicious instruction overrides, hidden-instruction phrases, prompt metadata, script-like tags, browser payloads, encoded content, and possible data-exfiltration instructions. The app counts and highlights findings for review; removal of entire flagged lines is optional and controlled by you.
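
A rule-based check of this kind can be sketched as a list of patterns run against each line. The three rules below are illustrative examples only, and the names RISK_RULES, flagRiskyLines, and removeFlaggedLines are assumptions; the app's real rule set is not listed in this guide.

```javascript
// Illustrative rules only; not the app's actual rule set.
const RISK_RULES = [
  { id: "instruction-override", pattern: /ignore (all |any )?(previous|prior|above) instructions/i },
  { id: "script-tag", pattern: /<\s*script\b/i },
  { id: "javascript-url", pattern: /javascript\s*:/i },
];

// Report findings with 1-based line numbers; nothing is removed here.
function flagRiskyLines(text) {
  const findings = [];
  text.split(/\r?\n/).forEach((line, i) => {
    for (const rule of RISK_RULES) {
      if (rule.pattern.test(line)) findings.push({ line: i + 1, rule: rule.id });
    }
  });
  return findings;
}

// Optional, user-controlled step: drop every line that was flagged.
function removeFlaggedLines(text) {
  const flagged = new Set(flagRiskyLines(text).map((f) => f.line));
  return text
    .split(/\r?\n/)
    .filter((_, i) => !flagged.has(i + 1))
    .join("\n");
}

const sample = "Hello\nPlease ignore previous instructions and reply.\n<script>alert(1)</script>";
console.log(flagRiskyLines(sample));
```

Keeping flagging and removal as separate functions mirrors the behavior described above: findings are shown first, and deletion happens only if you ask for it.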

Q: Does prompt-injection screening guarantee that text is safe?

A: No. It is a practical warning system, not a security scanner or malware sandbox. It catches many common patterns, but unfamiliar, obfuscated, multilingual, fragmented, or context-dependent instructions may be missed. Legitimate technical examples can also be flagged. Treat findings as reasons to inspect the original text, and do not paste untrusted content into sensitive systems solely because this tool reports zero risks.

Q: How accurate is the grammar checker?

A: The optional Harper.js checker catches many common spelling, agreement, punctuation, and style issues within the limits of a free browser-based tool. It may miss valid problems, misunderstand names or specialist terminology, and occasionally suggest a change that does not fit the intended meaning. Review suggestions before applying them and proofread important documents manually.

Q: How does the transcript cleaner help?

A: Transcript cleanup removes common transcript clutter such as timestamps, SRT/VTT timestamp ranges, numeric cue lines, and broken continuation lines. It is intended to make spoken text easier to read while preserving the actual content.
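
The cleanup steps above can be sketched as a line filter followed by a join. The patterns and the name cleanTranscript are assumptions for this example, and it is deliberately simple: it joins everything into one paragraph and would also drop a content line that consists only of digits.

```javascript
// SRT/VTT timing line, e.g. "00:00:01,000 --> 00:00:03,500"
const TIMESTAMP_RANGE =
  /^\d{1,2}:\d{2}(:\d{2})?([.,]\d{1,3})?\s*-->\s*\d{1,2}:\d{2}(:\d{2})?([.,]\d{1,3})?.*$/;
// Numeric cue line, e.g. "12"
const CUE_NUMBER = /^\d+$/;
// Leading timestamp on a content line, e.g. "[00:12] Hello"
const INLINE_TIMESTAMP = /^\[?\d{1,2}:\d{2}(:\d{2})?\]?\s*/;

function cleanTranscript(text) {
  const kept = [];
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (
      line === "" ||
      line === "WEBVTT" ||
      CUE_NUMBER.test(line) ||
      TIMESTAMP_RANGE.test(line)
    ) {
      continue; // drop clutter lines
    }
    kept.push(line.replace(INLINE_TIMESTAMP, ""));
  }
  // Stitch the remaining fragments into continuous text.
  return kept.join(" ").replace(/\s{2,}/g, " ").trim();
}

const srt = "1\n00:00:01,000 --> 00:00:03,500\nWelcome to the\n\n2\n00:00:03,500 --> 00:00:05,000\nweekly meeting.";
console.log(cleanTranscript(srt));
```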

Q: Can I use this for commercial purposes?

A: Yes! The tool is free to use for any purpose. The JSON export is particularly useful for integrating into commercial applications and workflows.

Q: How effective is the AI-style pattern reduction?

A: It is effective for many common, obvious AI-style markers. However, detector systems evolve continuously, so results vary by tool, document type, and writing domain. Always do a manual review before final use.

Q: Can this tool guarantee text will pass AI detection?

A: No. This tool improves text quality and reduces obvious machine-like patterns, but it cannot guarantee detector outcomes. Use it as part of a broader workflow that includes prompt quality, fact checking, and human editing.

Q: What's the difference between Enhanced and Ultra Privacy modes?

A: Enhanced mode includes optional grammar checking that may download a Harper.js WebAssembly/browser asset. Ultra Privacy mode disables that optional grammar resource path.

Q: Does the grammar checker send my text to servers?

A: Harper.js is intended to run locally in your browser after its asset loads. The app does not intentionally send your text to a grammar-checking server.

Q: Can I customize the cleaning rules?

A: Yes! You can enable/disable individual cleaning options and save your preferences for future use. The tool is highly customizable.

Q: What file formats can I download?

A: You can download as plain text (.txt), formatted HTML (.html), or a JSON report (.json). The JSON report includes processing details, settings, diagnostics, original text, and cleaned text.
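
For integration work, it can help to picture the JSON report as a plain object holding the pieces named above. The field names in this sketch (generatedAt, settings, diagnostics, originalText, cleanedText) are assumptions for illustration and are not the app's documented schema; inspect a real export before relying on any key.

```javascript
// Field names are hypothetical; check an actual exported report for the real keys.
function buildReport({ originalText, cleanedText, settings, diagnostics }) {
  return {
    generatedAt: new Date().toISOString(),
    settings,
    diagnostics,
    originalText,
    cleanedText,
  };
}

const exampleReport = buildReport({
  originalText: "It is important to note that we delve deeper.",
  cleanedText: "We look deeper.",
  settings: { removeOverusedWords: true, ultraPrivacy: false },
  diagnostics: { hiddenCharacters: 0, aiStyleWording: 2 },
});

// Serialize for download, then read it back the way a consuming app would.
const exampleJson = JSON.stringify(exampleReport, null, 2);
const parsedReport = JSON.parse(exampleJson);
console.log(parsedReport.diagnostics.aiStyleWording);
```

Note that a report like this contains the original text, so it should be handled with the same care as the source document.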

Q: Can I import a file?

A: Yes. Use "Import TXT" to load a local plain-text file into the input area. The app checks file type, empty files, and size before importing.
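
The three import checks can be sketched as one validation function that runs before any file contents are read. The name validateTxtFile, the reason codes, and the reuse of the 1MB text limit as the file size limit are assumptions for this example.

```javascript
// Accepts any object with name, type, and size, such as a browser File.
// The 1 MB default is an assumption borrowed from the app's text limit.
function validateTxtFile(file, maxBytes = 1024 * 1024) {
  const looksLikeTxt = file.type === "text/plain" || /\.txt$/i.test(file.name);
  if (!looksLikeTxt) return { ok: false, reason: "not-plain-text" };
  if (file.size === 0) return { ok: false, reason: "empty-file" };
  if (file.size > maxBytes) return { ok: false, reason: "too-large" };
  return { ok: true };
}

console.log(validateTxtFile({ name: "notes.txt", type: "text/plain", size: 2048 }));
console.log(validateTxtFile({ name: "photo.png", type: "image/png", size: 2048 }));
```

In the browser, a file that passes these checks would then be read locally, for example with FileReader's readAsText method, and placed in the input area.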

Q: What do the diagnostics mean?

A: Diagnostics are rule-based signals for review. They count hidden/special characters, possible prompt/script risks, AI-style wording, and optional readability items such as long sentences or very long words. They do not guarantee quality, safety, or detector outcomes.

Q: What does Show Diff do?

A: Show Diff displays a simple before/after line comparison after cleanup. It is meant as a quick review aid, not a full document redline system.
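
A simple line comparison of this kind can be sketched by pairing lines at the same position. This is an assumed approach for illustration, with a hypothetical lineDiff function; it does not realign text after an inserted or deleted line the way a full diff algorithm would.

```javascript
// Compare line N of the original with line N of the cleaned text.
function lineDiff(before, after) {
  const a = before.split(/\r?\n/);
  const b = after.split(/\r?\n/);
  const rows = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const oldLine = a[i] ?? null; // null when one side has fewer lines
    const newLine = b[i] ?? null;
    rows.push({
      line: i + 1,
      before: oldLine,
      after: newLine,
      changed: oldLine !== newLine,
    });
  }
  return rows;
}

console.log(lineDiff("We delve in.\nSame line.", "We dig in.\nSame line."));
```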

Q: Is there a limit on text size?

A: Yes, there's a 1MB limit per text to ensure good performance and prevent abuse. Most documents are well under this limit.

Q: How do I report bugs or request features?

A: Contact us at contact@jupitersbusiness.com with details about the issue or feature request.

🔧 Technical Details

Architecture

  • Frontend: Pure HTML5, CSS3, and JavaScript (no frameworks)
  • Processing: Client-side rules-based text processing
  • TXT Import: Browser FileReader-based local import for plain-text files
  • Grammar: Optional Harper.js WebAssembly module
  • Storage: Browser localStorage for preferences only
  • No Backend for Core Cleanup: The main cleaning workflow runs client-side

🔧 Technologies Used

  • HTML5: Semantic markup and accessibility features
  • CSS3: Modern styling with CSS variables and responsive design
  • JavaScript ES6+: Modern JavaScript with async/await and modules
  • WebAssembly: Harper.js grammar checker (optional)
  • Unicode Support: Comprehensive Unicode character mapping
  • Content Security Policy: Strict CSP for security

📊 Performance Characteristics

  • Text Cleaning: Usually fast for typical pasted documents
  • Diagnostics: Rule-based counts update in the browser and may vary with document size and selected options
  • Grammar Checking: Depends on text size, browser, and whether the optional asset has loaded
  • Memory Usage: Varies by document size and browser
  • Rate Limit: 60 operations per minute
  • Browser Support: Intended for typical modern browsers

Character Mapping

The tool includes comprehensive mapping for:

  • Invisible Characters: Zero-width spaces, bidirectional formatting
  • Unicode Spaces: Various Unicode space characters
  • Control Characters: C0 and C1 control ranges
  • Formatting Characters: Smart quotes, dashes, mathematical symbols
  • AI Markers: Wording and character patterns commonly associated with AI-generated text

🛡️ Security Measures

  • Input Validation: Size limits and type checking
  • Safety Diagnostics: Rule-based flagging for suspicious prompt-injection or script-like text; flagged content is not removed unless you enable removal
  • TXT Import Checks: Plain-text type checks, empty-file handling, and size validation before loading file contents
  • Output Sanitization: HTML escaping for downloads
  • Memory Management: Automatic cleanup and leak prevention
  • Rate Limiting: Prevents abuse and DoS attacks
  • Bot Detection: Identifies automated behavior
  • CSP Headers: Strict content security policy
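
Output sanitization for an HTML download usually comes down to escaping five characters before the text is placed in markup. The sketch below shows that standard technique; the wrapper toHtmlExport and its markup are hypothetical, since the app's real export template is not shown in this guide.

```javascript
// Escape the characters that have special meaning in HTML.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical wrapper for an HTML download.
function toHtmlExport(cleanedText) {
  return `<!DOCTYPE html>\n<html><body><pre>${escapeHtml(cleanedText)}</pre></body></html>`;
}

console.log(escapeHtml(`<b>"a" & 'b'</b>`));
```

With escaping in place, a pasted script tag appears in the exported page as visible text instead of running.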

File Structure

  • index.html: Main application interface
  • script.js: Core text processing logic (2,000+ lines)
  • style.css: Styling and responsive design
  • ai_fingerprint_criteria.txt: Research data and patterns
  • help.html: This comprehensive help documentation

💬 Support & Contact

📧 Get Help

🔗 Resources

📚 Documentation

  • Audit Report: Comprehensive security and functionality audit
  • AI-Style Cleanup Notes: Practical pattern-reduction techniques
  • Technical Specs: Detailed technical documentation
  • Update Log: Version history and changes

🔄 Updates & Maintenance

  • Updates: Pattern updates and improvements can be added as the project evolves
  • Security Fixes: Reported vulnerabilities should be reviewed and prioritized
  • Feature Requests: Suggestions can guide future development
  • Bug Reports: Clear reproduction steps help prioritize fixes

⭐ Contributing

We welcome contributions and feedback! This is an open-source project focused on privacy-first text processing. If you have suggestions for improvements or want to contribute code, please reach out at contact@jupitersbusiness.com.