Core Translation
4 services
Text Translation
Instant AI-powered text translation across 100+ languages
Our core translation engine processes plain text using a multi-layered AI pipeline that combines neural machine translation with context-aware language models. The engine detects the source language automatically, understands domain-specific vocabulary, and produces natural-sounding output in the target language.
How It Works
1. Paste or type any text; the source language is detected automatically
2. Our AI engine tokenises and analyses the input for context, tone, and domain
3. Neural machine translation generates a draft, then a post-editing model refines it
4. The final translation is returned with a confidence score and alternative suggestions
Use Cases
- Translating documents, emails, and messages on the fly
- Understanding foreign-language web content
- Professional translation with domain-specific accuracy (legal, medical, technical)
- Side-by-side bilingual reading
Technical Details
- Supports 100+ languages with ISO 639-1 codes
- Context window up to 8,000 tokens per request
- Sub-second response time for texts under 500 words
- Post-translation quality scoring using BLEU and COMET metrics
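The 8,000-token limit per request means longer texts have to be split before submission. The following is a minimal sketch of one way a client might do that. It assumes whitespace-separated words as a rough stand-in for tokens, and `split_into_requests` is a hypothetical helper name; the engine's real tokeniser and any server-side splitting may behave differently.

```python
import re

MAX_TOKENS = 8000  # per-request context window quoted in the technical details


def split_into_requests(text: str, max_tokens: int = MAX_TOKENS) -> list:
    """Group whole sentences into chunks that fit an approximate token budget.

    Tokens are approximated by whitespace-separated words (an assumption).
    A single sentence longer than the budget is kept as its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = []
    used = 0
    for sentence in sentences:
        size = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and used + size > max_tokens:
            chunks.append(" ".join(current))
            current, used = [], 0
        current.append(sentence)
        used += size
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries rather than at a fixed word count keeps each request self-contained, which matters for a context-aware engine.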
Voice Translation
Record speech, get translated text and audio instantly
Record audio directly through your microphone or upload a pre-recorded clip. The voice translation service transcribes the speech with high accuracy using deep-learning ASR (Automatic Speech Recognition), translates the transcript, and optionally synthesises the result back into speech in the target language.
How It Works
1. Start recording or upload an audio file (MP3, WAV, OGG, M4A)
2. A Whisper-class ASR model transcribes speech with punctuation and speaker diarisation
3. The transcript is passed through the core translation engine
4. Text-to-Speech synthesis produces a natural voice output in the target language
Use Cases
- Translating voice messages and audio notes
- Accessibility for users who prefer speaking over typing
- Field research — recording interviews and getting instant translations
- Language learning by listening to correct pronunciation
Technical Details
- ASR model with accuracy comparable to Whisper-large-v3
- Supports recordings up to 25 MB or 30 minutes
- Speaker diarisation identifies up to 8 distinct speakers
- TTS synthesis in 40+ language-accent pairs
Real-Time Conversation
Live streaming translation with sub-500 ms latency
Experience translation as you speak. The real-time mode uses WebSocket streaming to capture microphone input, run incremental ASR, and stream translated tokens back to the screen — all with end-to-end latency under 500 milliseconds. Ideal for live conversations, presentations, and broadcasts.
How It Works
1. A persistent WebSocket connection is established when you enter this mode
2. Your microphone audio is streamed in 100 ms chunks to the ASR service
3. Partial transcripts are translated incrementally and streamed to the UI
4. Final, corrected translations replace partial results as speech completes
Use Cases
- Live international meetings where everyone speaks their own language
- Conference presentations with real-time subtitle overlays
- One-on-one conversations with speakers of foreign languages
- Broadcast captioning for multilingual audiences
Technical Details
- WebSocket protocol with binary audio frame transport
- Incremental beam-search decoding for streaming ASR
- Target latency: < 500 ms from speech to translated text
- Automatic punctuation and sentence segmentation in the stream
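Two parts of this flow can be sketched in a few lines: cutting audio into 100 ms frames, and letting final results replace partial ones. The audio format (16 kHz, 16-bit mono PCM) and the names `frame_size_bytes`, `iter_frames`, and `LiveCaption` are assumptions for illustration; the service's actual wire format is not specified here.

```python
SAMPLE_RATE_HZ = 16_000  # assumed sample rate
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM
CHUNK_MS = 100           # chunk length stated in the steps above


def frame_size_bytes(sample_rate: int = SAMPLE_RATE_HZ,
                     bytes_per_sample: int = BYTES_PER_SAMPLE,
                     chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes in one audio frame of chunk_ms milliseconds."""
    return sample_rate * bytes_per_sample * chunk_ms // 1000


def iter_frames(pcm: bytes, frame_bytes: int):
    """Yield consecutive binary frames; the last one may be shorter."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


class LiveCaption:
    """Keeps finalised sentences and the current partial result separately."""

    def __init__(self) -> None:
        self.final = []
        self.partial = ""

    def on_partial(self, text: str) -> None:
        # A newer partial result overwrites the previous one.
        self.partial = text

    def on_final(self, text: str) -> None:
        # The corrected sentence replaces whatever partial text was shown.
        self.final.append(text)
        self.partial = ""

    def render(self) -> str:
        parts = self.final + ([self.partial] if self.partial else [])
        return " ".join(parts)
```

Under these assumptions each frame is 3,200 bytes, small enough to send as a single binary WebSocket message.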
Live Dialogue Mode
Two-way face-to-face translation between any two languages
Designed for face-to-face interactions, Live Dialogue Mode allows two people speaking different languages to have a natural conversation. Each person speaks into the device in turn; the app detects which language is being spoken, transcribes it, and plays the translation aloud so the other person can hear and respond.
How It Works
1. Select the two languages in use (or enable auto-detect for both)
2. Person A speaks; the app detects the language and translates into Language B
3. The translated audio plays through the speaker for Person B to hear
4. Person B replies and the flow reverses automatically
Use Cases
- Travellers talking to locals in hotels, restaurants, and markets
- Healthcare professionals consulting patients who speak different languages
- Border and immigration interactions
- Business negotiations between international partners
Technical Details
- Dual-channel VAD (Voice Activity Detection) for speaker turn detection
- Language-pair cache to maintain consistent style across a session
- Offline fallback mode with a compressed on-device model for 20 common languages
- Session history saved locally for review
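The automatic reversal of the flow comes down to choosing the other language of the pair once the spoken language is detected. A minimal sketch, with `reply_language` as a hypothetical helper name and language detection assumed to happen upstream:

```python
def reply_language(detected: str, pair: tuple) -> str:
    """Return the language to translate into, given the detected spoken language.

    `pair` holds the two session languages as ISO 639-1 codes.
    """
    first, second = pair
    if detected == first:
        return second
    if detected == second:
        return first
    # Speech in a third language cannot be routed within a two-language session.
    raise ValueError(f"{detected!r} is not one of the session languages {pair}")
```

For a session configured as `("en", "es")`, speech detected as `"es"` is translated into `"en"`, and the reverse applies on the next turn.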
Media & Documents
5 services
Meeting Translation
Transcribe and translate video-call meetings in real time
Connect Alsun to your video conference (Zoom, Teams, Google Meet) via browser extension or system audio capture. The meeting translation service produces a live multilingual transcript, translates each speaker's segment into your chosen language, and provides a searchable post-meeting summary.
How It Works
1. Capture system audio or connect via the Alsun browser extension
2. Each speaker is identified using diarisation; their segment is transcribed
3. Segments are translated to all selected target languages simultaneously
4. A structured transcript is saved and available for download after the meeting
Use Cases
- Global remote teams holding cross-language standups and reviews
- International client calls where some participants lack a shared language
- Webinars with multilingual audiences requiring live captioning
- Post-meeting review with translated and searchable minutes
Technical Details
- System audio capture via Web Audio API or native extension hook
- Speaker diarisation for up to 15 concurrent speakers
- Translation into up to 5 target languages simultaneously per session
- Meeting transcripts exportable as DOCX, SRT, or JSON
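SRT is one of the export formats listed above. The sketch below shows how translated, speaker-attributed segments could be serialised as SRT cues. The `Segment` fields and the "Speaker: text" prefix are assumptions for illustration, not the app's documented export layout.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str    # diarisation label, e.g. "Speaker 1"
    start_s: float  # segment start in seconds
    end_s: float    # segment end in seconds
    text: str       # translated text for this segment


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Serialise segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start_s)} --> {srt_timestamp(seg.end_s)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    return "\n".join(cues)
```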
Document & Photo Translation
Upload images or PDFs and get full translated text
Upload any document — scanned PDFs, photos of printed text, screenshots, or image files — and Alsun will extract the text using OCR (Optical Character Recognition) and translate it while preserving the original layout as closely as possible. Supports complex layouts including multi-column newspapers, tables, and mixed-script documents.
How It Works
1. Upload a PDF, image (JPG/PNG/WEBP), or take a photo with your camera
2. Layout analysis segments the document into text blocks, tables, and headers
3. OCR extracts text from each region with positional metadata
4. The translated text is overlaid on the original layout and available for download
Use Cases
- Translating scanned contracts, certificates, and official documents
- Reading foreign-language books, newspapers, and brochures
- Translating product packaging and instructions while travelling
- Converting handwritten notes to typed, translated text
Technical Details
- OCR engine supports Latin, Arabic, CJK, Cyrillic, Devanagari, and 30+ scripts
- Table structure recognition preserves rows and columns through translation
- PDF reconstruction places translated text in the original text regions at matching font sizes
- Maximum file size: 50 MB; up to 200 pages per document
Live Media Translation
Translate audio from live streams, videos, and broadcasts
Point Alsun at any audio source — a live TV broadcast, a streaming video, or a YouTube clip — and receive a real-time translated subtitle overlay. The service uses adaptive buffering to balance latency against accuracy, making it suitable for both live events and pre-recorded video review.
How It Works
1. Select the media source: system audio, browser tab, or URL
2. Audio is buffered in sliding 3-second windows for stable ASR
3. Transcripts are translated and timed to match the audio playback
4. Subtitles render as an overlay on the original video
Use Cases
- Watching foreign-language films and TV shows with instant subtitles
- Following live international news broadcasts
- Reviewing conference recordings in a different language
- Language learners shadowing native-speaker video content
Technical Details
- Browser tab capture via Screen Capture API
- Adaptive buffer: 3 s default, configurable 1–10 s
- SRT and VTT subtitle export for use in media players
- Supports 4K video playback without performance degradation
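The buffering described above can be illustrated with a small sketch that clamps the configured buffer to the 1–10 s range and computes overlapping windows over a stretch of audio. The 1-second hop between windows and the function names are assumptions; the text only states the window length and the configurable range.

```python
MIN_BUFFER_S = 1.0
MAX_BUFFER_S = 10.0
DEFAULT_BUFFER_S = 3.0


def clamp_buffer_seconds(requested: float) -> float:
    """Keep a user-configured buffer length inside the supported 1-10 s range."""
    return min(MAX_BUFFER_S, max(MIN_BUFFER_S, requested))


def sliding_windows(total_ms: int, window_ms: int = 3000, hop_ms: int = 1000) -> list:
    """Return (start_ms, end_ms) windows covering the audio.

    Consecutive windows overlap by window_ms - hop_ms (the hop is an assumption).
    """
    windows = []
    start = 0
    while start < total_ms:
        windows.append((start, min(start + window_ms, total_ms)))
        if start + window_ms >= total_ms:
            break  # this window already reaches the end of the audio
        start += hop_ms
    return windows
```

A shorter buffer lowers latency for live events, while a longer one gives the ASR model more context, which suits pre-recorded review.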
Podcast Translation
Translate entire podcast episodes with speaker attribution
Upload a podcast audio file or provide an RSS feed URL. Alsun transcribes each episode with speaker attribution, translates the full transcript, and generates a translated audio version using TTS synthesis. The result is a bilingual podcast player where you can follow along in any language.
How It Works
1. Upload an MP3/WAV file or paste an RSS feed or episode URL
2. The audio is transcribed with timestamps and speaker labels
3. Each speaker segment is translated while maintaining conversational flow
4. TTS synthesis generates the translated audio track with matched speaker voices
Use Cases
- Listening to educational podcasts in your native language
- Podcast producers reaching global audiences with auto-translated feeds
- Language learners following podcasts in their target language
- Researchers analysing cross-language media content
Technical Details
- Episode lengths up to 4 hours supported
- Speaker diarisation tags up to 6 hosts/guests
- TTS synthesis with matched vocal style per speaker
- Output: bilingual transcript PDF + translated MP3 audio
E-Book Translation
Translate e-books and long-form text page by page
The e-book translator allows you to paste or upload large blocks of text representing book chapters or pages and receive chapter-by-chapter translations. Each translation is saved to your account for offline review, and the service maintains character and terminology consistency across the entire book using a translation memory.
How It Works
1. Paste chapter text or upload a plain-text or EPUB file
2. A translation memory is built from the first chapter to ensure consistent terminology
3. Each page or chapter is translated while referencing the memory
4. Translated pages are stored in your library for review and re-download
Use Cases
- Reading foreign-language novels in your preferred language
- Publishers preparing bilingual editions
- Students accessing academic texts in translation
- Writers proofreading their own work in a second language
Technical Details
- Translation memory (TM) per book for terminology consistency
- Batch processing: entire books in the background with notification on completion
- EPUB and plain-text (.txt) input supported
- Output: translated EPUB + side-by-side bilingual PDF
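A per-book translation memory can be pictured as a glossary that fixes the first rendering of each term and is consulted for every later chapter. The class below is a simplified sketch under that assumption; the names are hypothetical and a production TM would typically store whole segments, not just terms.

```python
import re
from typing import Optional


class TranslationMemory:
    """Remembers how each source term was first translated in a book."""

    def __init__(self) -> None:
        self._terms = {}  # lower-cased source term -> chosen translation

    def learn(self, source_term: str, target_term: str) -> None:
        # setdefault keeps the first rendering, so later chapters stay consistent.
        self._terms.setdefault(source_term.lower(), target_term)

    def lookup(self, source_term: str) -> Optional[str]:
        return self._terms.get(source_term.lower())

    def glossary_for(self, text: str) -> dict:
        """Return only the entries whose source term occurs in this text."""
        return {
            source: target
            for source, target in self._terms.items()
            if re.search(rf"\b{re.escape(source)}\b", text, re.IGNORECASE)
        }
```

The glossary returned for a chapter could then be passed to the translation step as a set of required renderings.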
Learning & Study
3 services
Language Learning
Structured lessons powered by spaced-repetition AI
The Language Learning module combines AI-generated lessons with spaced-repetition scheduling so you retain vocabulary and grammar over time. Progress is tracked per skill and language, and the AI adapts the difficulty based on your performance — giving you more of what you need and less of what you have already mastered.
How It Works
1. Choose a target language and current proficiency level (A1–C2)
2. The AI creates a personalised curriculum of vocabulary, grammar, and conversation exercises
3. Spaced-repetition scheduling surfaces review cards at optimal intervals
4. Speaking exercises are graded by the ASR pronunciation scorer
Use Cases
- Beginners building a vocabulary foundation systematically
- Intermediate learners strengthening grammar weak points
- Professionals preparing for language certifications
- Travellers learning essential phrases before a trip
Technical Details
- SM-2 algorithm variant for spaced-repetition scheduling
- Per-skill mastery tracking stored in your user profile
- Pronunciation scoring using phoneme-level alignment
- Curriculum covers CEFR levels A1 through C2
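For reference, the classic SM-2 update rule is sketched below. The module is described as using a variant, so its actual parameters may differ. This version follows one common reading of SM-2: intervals of 1 and 6 days for the first two successful reviews, then the previous interval multiplied by the ease factor, with a failed review restarting the repetitions and leaving the ease factor unchanged.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 ease factor, never below 1.3


def review(card: Card, quality: int) -> Card:
    """Return the card's next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the repetition sequence, keep the ease factor.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)
    miss = 5 - quality
    ease = max(1.3, card.ease + 0.1 - miss * (0.08 + miss * 0.02))
    return Card(repetitions=card.repetitions + 1, interval_days=interval, ease=ease)
```

Three perfect reviews in a row schedule a new card after 1, 6, and then 16 days.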
AI Language Tutor
Conversational AI tutor for guided language practice
The AI Language Tutor is an interactive conversational partner that holds structured lessons, corrects your mistakes in real time, and explains grammar rules in plain language. It adapts to your learning goals and provides immediate feedback on vocabulary choice, sentence structure, and pronunciation.
How It Works
1. Start a session by telling the tutor your goal (e.g. 'practise past tense in Spanish')
2. The tutor guides you through exercises, asks follow-up questions, and monitors accuracy
3. Errors are corrected inline with explanations in your native language
4. Session summaries show what you practised and areas to revisit
Use Cases
- One-on-one tutoring without scheduling a human teacher
- Conversational practice for students preparing for oral exams
- Business professionals polishing formal register in a second language
- Children learning a heritage language at home
Technical Details
- Powered by a fine-tuned LLM trained on language pedagogy data
- Grammatical error classification across 12 error categories
- Native-language explanations available in 20+ languages
- Session history and mistake log stored in your profile
Auto-Translate Overlay
Hover-translate any text on your screen instantly
The browser extension adds an invisible translation layer to any web page. Hover over any paragraph, tooltip, or UI element to see an instant translation pop-up. No copy-pasting required — the overlay intercepts mouse events and translates the hovered node's text content on demand.
How It Works
1. Install the Alsun browser extension (Chrome, Firefox, Edge)
2. On any web page, hover over text and press the trigger key (default: Alt)
3. The extension sends the text to the Alsun API and shows the translation in a floating tooltip
4. Click to copy the translation or open it in the full Alsun editor
Use Cases
- Reading foreign-language research papers and articles without switching tabs
- Navigating foreign-language software interfaces and menus
- Social media users reading posts in other languages inline
- Online shoppers checking product descriptions on international stores
Technical Details
- MutationObserver detects dynamic page content changes
- Debounced hover events to avoid redundant API calls
- Translation cache per page session to minimise latency on repeated hovers
- Compatible with Chrome 112+, Firefox 120+, Edge 112+
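The interplay of debouncing and caching can be shown independently of the browser. The sketch below is written in Python purely to illustrate the logic; a real extension would implement it in JavaScript. The 0.3 s delay, the class name, and the `translate` callback are assumptions for illustration.

```python
class HoverTranslator:
    """Translates only the most recent hover, and only once per distinct text."""

    def __init__(self, translate, delay_s: float = 0.3) -> None:
        self._translate = translate  # callable that performs the API request
        self._delay_s = delay_s      # quiet period required before translating
        self._cache = {}             # per-page-session cache: text -> translation
        self._pending = None         # (text, hovered_at) of the latest hover

    def on_hover(self, text: str, now: float) -> None:
        # Each new hover replaces the pending one, which is the debounce step.
        self._pending = (text, now)

    def poll(self, now: float):
        """Return a translation once the pointer has rested long enough, else None."""
        if self._pending is None:
            return None
        text, hovered_at = self._pending
        if now - hovered_at < self._delay_s:
            return None
        self._pending = None
        if text not in self._cache:
            self._cache[text] = self._translate(text)
        return self._cache[text]
```

Passing the clock in as `now` keeps the sketch deterministic; in a browser the same effect is usually achieved with a timer that is reset on every hover event.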
Travel & Tourism
5 services
Tourism Search
Translate tourism queries and find local information worldwide
Tourism Search translates your travel queries and searches local tourism databases to surface relevant information — hotels, attractions, restaurants, and services — with descriptions in your language. It understands natural-language questions like 'best traditional food near me in Cairo' and returns structured, translated results.
How It Works
1. Enter a natural-language tourism query in any language
2. The query is translated to the local language for database search
3. Results from tourism APIs and local directories are fetched and returned
4. Listings are translated back to your language with key details highlighted
Use Cases
- Researching destinations before travel
- Finding local services when you don't speak the language
- Comparing hotels and attractions with translated reviews
- Generating itineraries based on interests and location
Technical Details
- Integrates with global tourism data providers and local directories
- Natural-language query understanding in 40+ languages
- Geolocation-aware search for proximity-ranked results
- Structured output: name, category, address, rating, translated description
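Proximity ranking over structured results can be sketched with the haversine great-circle distance. The `Listing` fields mirror the structured output listed above; the `lat` and `lon` fields and the function names are assumptions added for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Listing:
    name: str
    category: str
    address: str
    rating: float
    translated_description: str
    lat: float  # assumed field, decimal degrees
    lon: float  # assumed field, decimal degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rank_by_proximity(listings, user_lat: float, user_lon: float) -> list:
    """Return listings ordered from nearest to farthest from the user."""
    return sorted(listings, key=lambda item: haversine_km(user_lat, user_lon, item.lat, item.lon))
```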
Tourism Photo Recognition
Identify landmarks, hotels, and places from a photo
Take a photo of any landmark, building, monument, or sign, and the Photo Recognition service will identify it, provide historical and contextual information in your language, and link to related tourism resources. Powered by a vision model trained on millions of geotagged images worldwide.
How It Works
1. Upload a photo or take one directly with your device camera
2. The vision model classifies the image against a database of landmarks worldwide
3. Historical, cultural, and practical information is retrieved and translated
4. The result shows location name, country, description, and a confidence score
Use Cases
- Identifying unfamiliar buildings and monuments while sightseeing
- Learning the history of a place you are visiting in real time
- Cross-checking photos taken on previous trips
- Travel bloggers enriching posts with accurate location data
Technical Details
- Vision model fine-tuned on 10M+ labelled landmark images
- Returns confidence score between 0 and 1 for each match
- Historical data sourced from curated tourism knowledge bases
- Supports JPG, PNG, WEBP; max 10 MB per image
Tourism Reviews
Share and read traveller reviews in any language
The Tourism Reviews module enables travellers to leave reviews of landmarks, hotels, restaurants, and services in their native language, with automatic translation so that all reviews are readable by anyone in any language. This creates a truly multilingual review ecosystem for global travellers.
How It Works
1. Write a review in your own language; no translation is required
2. The review is stored and automatically translated into 10 major languages
3. Readers see reviews in their chosen language regardless of the original language
4. Star ratings, photos, and structured data are preserved across translations
Use Cases
- Travellers sharing authentic experiences with a global audience
- Tourism operators gathering multilingual feedback from international guests
- Tourists reading honest reviews without the language barrier
- Travel platforms building a multilingual review corpus
Technical Details
- Reviews stored in Supabase with full-text search in translated languages
- On-write translation pipeline processes all 10 target languages asynchronously
- Rating aggregation and sentiment analysis across all translated versions
- Photo URL storage and display with localised alt text
Itinerary Builder
AI-generated travel itineraries in your language
Describe your trip — destination, duration, interests, and budget category — and the Itinerary Builder will generate a day-by-day travel plan in your language, complete with time slots, transport suggestions, local dining recommendations, and cultural notes for each activity.
How It Works
1. Enter your destination, travel dates, interests, and group size
2. The AI cross-references tourism data, seasonal events, and opening hours
3. A structured itinerary is generated with morning, afternoon, and evening slots
4. Each activity includes a translated description, address, and practical tips
Use Cases
- First-time travellers planning a complete trip without a travel agent
- Groups with mixed interests who need a balanced schedule
- Spontaneous travellers who want a plan with a day's notice
- Travel bloggers generating draft itineraries for editorial review
Technical Details
- Real-time data from tourism APIs for opening hours and current events
- Itinerary stored in your account for editing and sharing
- Export as PDF, shareable link, or calendar (.ics) format
- Adjusts for local public holidays and seasonal closures
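Calendar export in .ics format follows the iCalendar standard (RFC 5545). The sketch below shows the minimum needed to turn itinerary activities into events. The tuple layout, the `PRODID` value, and the function names are assumptions; line folding for long lines is omitted for brevity.

```python
from datetime import datetime, timezone


def _escape(text: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (text.replace("\\", "\\\\")
                .replace(";", "\\;")
                .replace(",", "\\,")
                .replace("\n", "\\n"))


def _stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC iCalendar timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def itinerary_to_ics(activities, created: datetime) -> str:
    """Build a calendar from (uid, summary, start, end) tuples.

    All datetimes must be timezone-aware. Lines end with CRLF as the format requires.
    """
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//itinerary-sketch//EN"]
    for uid, summary, start, end in activities:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uid}",
            f"DTSTAMP:{_stamp(created)}",
            f"DTSTART:{_stamp(start)}",
            f"DTEND:{_stamp(end)}",
            f"SUMMARY:{_escape(summary)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"
```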
Travel Mode
Compact offline-capable translation for active travel
Travel Mode is a streamlined, mobile-optimised interface designed for use while on the move. It downloads a compact on-device model for your selected language pair so that core translation works even without an internet connection, which is essential in remote areas, on flights, and when roaming internationally.
How It Works
1. Select up to 3 language pairs to download for offline use
2. The compact on-device model is installed (50–150 MB per language pair)
3. When offline, translations run locally on your device with no network required
4. When online, the full cloud model is used for higher accuracy
Use Cases
- Travelling in areas with limited or no mobile data
- Using translation on long-haul flights in aeroplane mode
- Remote expeditions and fieldwork in areas without reliable connectivity
- Avoiding international data roaming charges
Technical Details
- On-device model: distilled transformer, 50–150 MB per language pair
- Automatic, seamless fallback from the online model to the offline model
- Offline supports text and camera translation; voice requires online connectivity
- Model updates downloaded over Wi-Fi automatically
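The online-to-offline fallback can be sketched as a wrapper that tries the cloud model first and switches to the on-device model when the network call fails. The function name and the two callables are hypothetical stand-ins; the sketch also assumes the cloud client raises `ConnectionError` or `TimeoutError` when there is no connectivity.

```python
def translate_with_fallback(text: str, cloud_translate, local_translate):
    """Return (translation, source), where source is 'cloud' or 'on-device'.

    cloud_translate and local_translate are callables taking the text and
    returning a translation.
    """
    try:
        return cloud_translate(text), "cloud"
    except (ConnectionError, TimeoutError):
        # No usable connection: run the compact on-device model instead.
        return local_translate(text), "on-device"
```

Returning the source alongside the translation lets the interface indicate when a result came from the less accurate offline model.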
Special Features
4 services
Speaker Mode
Large-display translation for presentations and classes
Speaker Mode presents translated text in large, high-contrast typography optimised for a second screen or projector. As the speaker talks, the translated text auto-scrolls for the audience to follow. Ideal for classroom teaching, public lectures, and conference keynotes with multilingual audiences.
How It Works
1. Connect a second screen or start fullscreen mode
2. The speaker talks into the microphone as in real-time mode
3. Translated text appears in large font on the display screen
4. Audience members can follow along in their chosen language
Use Cases
- University lecturers teaching international students
- Conference keynote speakers reaching multilingual audiences
- Church services and community events with diverse congregations
- Government briefings broadcast to multilingual populations
Technical Details
- Presentation display API for dual-screen output
- Configurable font size (24pt–96pt) and contrast themes
- Up to 3 simultaneous target languages in split-screen layout
- Presenter notes remain private on the control screen
AI Support Chat
Multilingual AI assistant for help and guidance
The AI Support Chat provides round-the-clock assistance in your language. It can answer questions about the app, guide you through features, help troubleshoot issues, and provide translation-related advice. The agent understands over 50 languages and always responds in the language you write in.
How It Works
1. Open the chat panel and type your question in any language
2. The AI identifies your language and responds in kind
3. For feature guidance, it can open the relevant tool directly
4. Complex issues are escalated with a session log to human support
Use Cases
- New users learning how to use specific features
- Getting instant answers about supported languages and capabilities
- Troubleshooting translation quality issues
- General queries about accounts, credits, and data privacy
Technical Details
- Powered by a fine-tuned multilingual support LLM
- Intent classification routes queries to the correct knowledge base
- Session context window: 20 conversation turns
- Human escalation with full session log handover
Emergency Voice Mode
One-tap SOS translation for urgent situations
Emergency Voice Mode provides a single large button that immediately activates translation of critical emergency phrases. The screen displays large-text translations of key safety messages ('Call an ambulance', 'I need help', 'I am lost'), and the microphone listens for distress words to trigger an automatic alert.
How It Works
1. Tap the large SOS button to enter emergency mode instantly
2. Pre-loaded critical phrases are displayed in the selected language
3. The microphone continuously listens for distress keywords
4. Emergency contact numbers for the detected or selected country are shown
Use Cases
- Travellers in distress who cannot communicate with local emergency services
- Medical emergencies where the patient does not speak the local language
- Natural disasters and evacuations with multilingual affected populations
- Safety-critical field work in foreign countries
Technical Details
- Offline emergency phrase set pre-loaded at installation (10 KB)
- Distress keyword detection using a lightweight on-device classifier
- Emergency number database for 195 countries with category breakdown
- The screen stays on and brightness is set to maximum automatically in this mode
Emergency Numbers Directory
Verified emergency contacts for every country
A comprehensive, offline-capable directory of emergency telephone numbers for police, ambulance, fire, and specialist services in every country. Numbers are categorised by service type and verified against official national databases. The directory works without an internet connection once loaded.
How It Works
1. Select a country from the dropdown or allow location access for auto-detection
2. All emergency numbers for that country are displayed, categorised by service type
3. Tap any number to initiate a call directly from the app
4. Copy any number to clipboard with a single tap
Use Cases
- Travellers who need emergency numbers immediately in an unfamiliar country
- Travel safety apps and guides embedding the directory
- Parents teaching children which number to call in different countries
- Corporate travel teams providing duty-of-care resources
Technical Details
- Directory covers 195 countries with ISO 3166-1 alpha-2 codes
- Service types: Police, Ambulance, Fire, Medical, Coastguard, Mountain Rescue
- Offline-first: full directory cached on first load
- Numbers verified against official government sources quarterly
Ready to break the language barrier?
All services are available through the Alsun web app, mobile app, and browser extension. Start with text translation — no sign-up required.