The app-text category contains tools for working with human-language text files.

All packages

7plus An encoder for packet radio
a2ps Any to PostScript filter
acroread Adobe's PDF reader
active-dvi A DVI previewer and a presenter for slides written in LaTeX
adiff wordwise diff
agrep agrep is a tool for the fast searching of text allowing for errors in the search pattern
aiksaurus A thesaurus lib, tool and database
an Very fast anagram generator with dictionary lookup
ansifilter Handles text files containing ANSI terminal escape codes
antiword free MS Word reader
antixls It is used to print out an XLS file with minimal formatting, or extracts the data into CSV format
apvlv Alf's PDF Viewer Like Vim
asa ASA Carriage control conversion for output by Fortran programs
asciidoc A text document format for writing short documents, articles, books and UNIX man pages
aspell A spell checker replacement for ispell
atril Atril document viewer for MATE
bact Boosting Algorithm for Classification of Trees
barcode barcode generator
bdf2psf Converter to generate console fonts from BDF source fonts
bibclean BibTeX bibliography prettyprinter and syntax checker
bibletime Qt Bible study application using the SWORD library
bibus Bibliographic and reference management software, integrates with L/OO.o and MS Word
bibutils Interconverts between various bibliography formats using a common XML intermediate
binfind binfind searches files for a byte sequence specified on the command line
blahtexml TeX-to-MathML converter
blogc A blog compiler
bogosort A file sorting program which uses the bogosort algorithm
build-docbook-catalog DocBook XML catalog auto-updater
c2ps Generates a beautified ps document from a source file (c/c++)
calibre Ebook management application
catdoc Converter for Microsoft Word, Excel, PowerPoint and RTF files to text
cb2bib Tool for extracting unformatted bibliographic references
cedilla UTF-8 to postscript converter
chasen Japanese Morphological Analysis System, ChaSen
chm2pdf A script that converts a CHM file into a single PDF file
clara An OCR (Optical Character Recognition) program
cmigemo C/Migemo -- Migemo library implementation in C
code2html Converts source files to colored HTML output
convertlit CLit converts MS ebook .lit files to .opf (xml+html+png+jpg)
convmv convert filenames to utf8 or any other charset
cook Embedded language which can be used as a macro preprocessor and for similar text processing
cpdf A command line tool for manipulating PDF files
crf++ Yet Another CRF toolkit for segmenting/labelling sequential data
crm114 A powerful text processing tool, mainly used for spam filtering
csvfix A stream editor for manipulating CSV files
cuneiform An enterprise quality optical character recognition (OCR) engine by Cognitive Technologies
cutemarked Qt5 markdown editor
cwtext Text to Morse Code converter
dbacl Digramic Bayesian text classifier
ddir A perl implementation of the tree(1) program
delta Heuristically minimizes interesting files
dictd Dictionary Client/Server for the DICT protocol
diction Diction and style checkers for English and German texts
diffpdf Program that textually or visually compares two PDF files
ding Tk based dictionary (German-English) (incl. dictionary itself)
discount An implementation of John Gruber's Markdown text to html language written in C
djview Portable DjVu viewer using Qt4
djvu DjVu viewers, encoders and utilities
djvusmooth Graphical editor for DjVu documents
docbook-dsssl-stylesheets DSSSL Stylesheets for DocBook
docbook-sgml A helper package for sgml docbook
docbook-sgml-dtd Docbook SGML DTD 4.5
docbook-sgml-utils Shell scripts to manage DocBook documents
docbook-xml-dtd Docbook DTD for XML
docbook-xml-simple-dtd Simplified Docbook DTD for XML
docbook-xsl-ns-stylesheets XSL Stylesheets for Docbook
docbook-xsl-stylesheets XSL Stylesheets for Docbook
docbook2X Tools to convert docbook to man and info
docx2txt Convert MS Office docx files to plain text
dos2unix Convert DOS or MAC text files to UNIX format or vice versa
duali Arabic dictionary based on the DICT protocol
dvibook DVI file utilities: dvibook, dviconcat, dvitodvi, and dviselect
dvipdfm DVI to PDF translator
dvipdfmx DVI to PDF translator with multi-byte character support
dvipng Translate DVI files into PNG or GIF graphics
dvipsk DVI-to-PostScript translator
dvisvgm Converts DVI files to SVG
ebook-tools Tools for accessing and converting various ebook file formats
enchant Spellchecker wrapping library
enscript powerful text-to-postscript converter
epspdf GUI and command-line converter for [e]ps and pdf
epstool Creates or extracts preview images in EPS files, fixes bounding boxes, converts to bitmaps
epubcheck Tool to validate IDPF EPUB files
evince Simple document viewer for GNOME
expander Expander is a utility that acts as a filter for text editors
fbless Python-based console fb2 reader with less-like interface
fbreader E-Book Reader. Supports many e-book formats
fdftk Acrobat FDF Toolkit
flpsed Pseudo PostScript editor
freepwing FreePWING is a free JIS X 4081 (subset of EPWING V1) formatter
gentoo-guide-xml-dtd DTD for Gentoo-Guide Style XML Files
getxbook Download books from google, amazon, barnes and noble
ghostscript-gpl Ghostscript is an interpreter for the PostScript language and for PDF
glark File searcher similar to grep but with fancy output
glosung Watch word program for the GNOME2 desktop (watch word (German): losung)
gnome-doc-utils A collection of documentation utilities for the Gnome project
gnopaster A submitter for gnopaste, a nopaste service like http://nopaste.info
gocr An OCR (Optical Character Recognition) reader
goldendict Feature-rich dictionary lookup program
groonga An Embeddable Fulltext Search Engine
groonga-normalizer-mysql Groonga plugin that provides MySQL compatible normalizers
grutatxt A converter from plain text to HTML and other markup languages
gtkspell Spell checking widget for GTK
gtranslator An enhanced gettext po file editor for GNOME
gv Viewer for PostScript and PDF documents using Ghostscript
hd2u Dos2Unix like text file converter
highlight converts source code to formatted text ((X)HTML, RTF, (La)TeX, XSL-FO, XML) with syntax highlight
hnb A program to organize many kinds of data in one place
htag random signature maker
html-xml-utils A number of simple utilities for manipulating HTML and XML files
html2text A HTML to text converter
html401 DTDs for the HyperText Markup Language 4.01
htmlc HTML template files expander
htmldoc Convert HTML pages into a PDF document
htmlinc HTML Include System by Ulli Meybohm
htmlrecode Recodes HTML file using a new character set
htmltidy Tidy the layout and correct errors in HTML and XML documents
htp An HTML preprocessor
hunspell Hunspell spell checker - an improved replacement for myspell in OOo
hyperestraier a full-text search system for communities
info2html Converts GNU .info files to HTML
iso-codes ISO language, territory, currency, script codes and their translations
itex2mml A LaTeX into XHTML/MathML converter
jabref Java GUI for managing BibTeX and other bibliographies
jabref-bin Java GUI for managing BibTeX and other bibliographies
jadetex TeX macros used by Jade TeX output
jist A ruby gem to publish a gist
jmupdf Java library for rendering PDF, XPS and CBZ (Comic Book) documents
jpdftweak Swiss Army Knife for PDF files
kbibtex BibTeX editor for KDE to edit bibliographies used with LaTeX
kchmviewer A feature rich chm file viewer, based on Qt
kding KDE port of Ding, a dictionary lookup program
keepnote A note taking application
kpaste Command-line tool to paste to paste.kde.org
krop A tool to crop PDF files
landslide Landslide generates a slideshow using the slides that power the html5-slides presentation
lcdf-typetools Font utilities for eg manipulating OTF
letterize Generate English-plausible alphabetic mnemonics for a phone number
libabw Library parsing abiword documents
libebook Library parsing various ebook formats
libetonyek Library parsing Apple Keynote presentations
libexttextcat Library implementing N-gram-based text categorization
libgxps Library for handling and rendering XPS documents
liblangtag An interface library to access tags for identifying languages
libmspub Library parsing Microsoft Publisher documents
libmwaw Library parsing many pre-OSX MAC text formats
libodfgen Library to generate ODF documents from libwpd and libwpg
libpaper Library for handling paper characteristics
libspectre A library for rendering Postscript documents
libwpd WordPerfect Document import/export library
libwpg C++ library to read and parse graphics in WPG
libwps Microsoft Works file word processor format import filter library
libxmlpatch A set of tools to create and apply patches to XML files using XPath
linuxdoc-tools A toolset for processing LinuxDoc DTD SGML files
llpp a graphical PDF viewer which aims to superficially resemble less(1)
lodgeit Command-line interface to paste.pocoo.org
logmerge Merge multiple logs such that multilined entries appear in chronological order without breaks
lout high-level language for document formatting
lv Powerful Multilingual File Viewer
manpager Enable colorization of man pages
mathtex MathTeX lets you easily embed LaTeX math in your own html pages, blogs, wikis, etc
mbtpdfasm Tool to assemble/merge PDF files, extract information from, and update the metadata in PDF files
mecab Yet Another Part-of-Speech and Morphological Analyzer
mftrace Traces TeX fonts to PFA or PFB fonts (formerly pktrace)
mht-rip convert mht/mhtml files to something usable
mpage Many to one page printing utility
multitail Tail with multiple windows
mupdf a lightweight PDF viewer and toolkit written in portable C
mythes A simple thesaurus for Libreoffice
namazu Namazu is a full-text search engine
nfoview simple viewer for NFO files, which are ASCII art in the CP437 codepage
notecase Hierarchical note manager written using GTK+ and C++
noweb a literate programming tool, lighter than web
o3read Converts OpenOffice formats to text or HTML
ocrad GNU Ocrad is an OCR (Optical Character Recognition) program
odt2txt A simple converter from OpenDocument Text to plain text
openjade Jade is an implementation of DSSSL - an ISO standard for formatting SGML and XML documents
openlp Free church presentation software
opensp A free, object-oriented toolkit for SGML parsing and entity management
ots Open source Text Summarizer, as used in newer releases of abiword and kword
pandoc Conversion between markup formats
paperwork a personal document manager for scanned documents (and PDFs)
paps Unicode-aware text to PostScript converter
par a paragraph reformatter, vaguely similar to fmt, but better
passivetex A namespace-aware XML parser written in TeX
pastebinit Software that lets you send anything you want directly to a pastebin from the command line
pdf2djvu A tool to create DjVu files from PDF files
pdf2html Converts pdf files to html files
pdf2oo Converts pdf files to odf
pdfgrep A tool similar to grep which searches text in PDFs
pdfjam pdfnup, pdfjoin and pdf90
pdfminer Python tool for extracting information from PDF documents
pdfsandwich generator of sandwich OCR pdf files
pdfshuffler GUI app that can merge or split pdfs and rotate, crop and rearrange their pages
pdftk A tool for manipulating PDF documents
peg-markdown Implementation of markdown in C, using a PEG grammar
pelican A tool to generate a static blog, with restructured text (or markdown) input files
pep General purpose filter and file cleaning program
pinfo Hypertext info and man viewer based on (n)curses
po4a Tools for helping translation of documentation
podofo PoDoFo is a C++ library to work with the PDF file format
poppler PDF rendering library based on the xpdf-3.0 code base
poppler-data Data files for poppler to support uncommon encodings without xpdfrc
ps2eps Generate Encapsulated Postscript Format (EPS,EPSF) files from one-page Postscript documents
ps2pkm Tool that converts a PostScript type1 font into a corresponding TeX PK font
psiconv An interpreter for Psion 5(MX) file formats
pspdftool Tool for prepress preparation of PDF and PostScript documents
pspresent A tool to display full-screen PostScript presentations
pstotext Extract ASCII text from a PostScript or PDF file
psutils PostScript Utilities
pybookreader A book reader for .fb2 .html and plain text (possibly gzipped)
pylize Python HTML Slideshow Generator using HTML and CSS
pytextile A Python implementation of Textile, Dean Allen's Human Text Generator for creating (X)HTML
qpdf A command-line program that does structural, content-preserving transformations on PDF files
qpdfview A tabbed document viewer
queequeg A checker for English grammar, for people who are not native English speakers
rarian A documentation metadata library
recode Convert files between various character sets
reed This is a text pager (text file viewer), used to display etexts
refbase Web-based solution for managing scientific literature, references and citations
referencer Gnome application to organise documents or references, and to generate BibTeX bibliography files
restview reStructuredText viewer
rfcutil return all related RFCs based upon a number or a search string
rhyme Console based Rhyming Dictionary
rman PolyGlotMan man page translator AKA RosettaMan
rnv A lightweight Relax NG Compact Syntax validator
robodoc Automating Software Documentation
ronn Ronn converts simple, human readable textfiles to roff for terminal display, and also to HTML
rpl Intelligent recursive search/replace utility
rtf2html RTF to HTML converter
sablotron An XSLT Parser in C++
sary Sary: suffix array library and tools
scrollkeeper Dummy scrollkeeper for testing rarian
scrollkeeper-dtd DTD from the Scrollkeeper package
sdcv Console version of Stardict program
searchmonkey Powerful text searches using regular expressions
sgml-common Base ISO character entities and utilities for SGML
sgmltools-lite Python interface to SGML software in a DocBook/OpenJade env
sgrep Use structural criteria to grep and index text, SGML, XML and HTML and filter text streams
sigil Sigil is a multi-platform WYSIWYG ebook editor for ePub format
silvercity A lexical analyser for many languages
skribe Skribe is a text processor for technical documents written in scheme
sloccount Tools for counting Source Lines of Code (SLOC) for a large number of languages
spellutils spellutils includes 'newsbody' (useful for spellchecking in mails, etc.)
stardict An international dictionary supporting fuzzy and glob style matching
sword Library for Bible reading software
sword-modules A collection of modules for the SWORD project
t1utils Type 1 Font utilities
tabler A utility to create text art tables from delimited input
talkfilters Convert ordinary English text into text that mimics a stereotyped or otherwise humorous dialect
teckit Text Encoding Conversion toolkit
teseq a tool for analyzing files that contain control characters and terminal control sequences
tesseract An OCR Engine, originally developed at HP, now open source
texi2html Perl script that converts Texinfo to HTML
texlive A complete TeX distribution
texlive-core A complete TeX distribution
tkinfo Info Browser in TK
tkman TkMan man and info page browser
tofrodos text file conversion utility that converts ASCII files between the MSDOS format and the Unix format
tokyodystopia A fulltext search engine for Tokyo Cabinet
trang Multi-format schema converter based on RELAX NG
tree Lists directories recursively, and produces an indented listing of files
ttf2pk2 Freetype 2 based TrueType font to TeX's PK format converter
ttf2pt1 True Type Font to Postscript Type 1 Converter
tuxcards A hierarchical notebook
txt2man Scripts to convert regular ASCII text to man pages
txt2pdbdoc Text/HTML to Doc file converter for the Palm Pilot
txt2tags A tool for generating marked up documents (HTML, SGML, ...) from a plain text file with markup
u2ps A text to PostScript converter like a2ps, but supports UTF-8
unac Library and command-line tool for removing accents from characters
unpaper Post-processor for scanned and photocopied book pages
unrtf Converts RTF files to various formats
uudeview uu, xx, base64, binhex decoder
uvconv A small utility that converts among Vietnamese charsets
vilistextum HTML to ASCII converter specifically programmed to get the best out of incorrect HTML
vlna Add nonbreakable spaces after some prepositions in Czech texts
wdiff Create a diff disregarding formatting
webgen A template-based static website generator
wgetpaste Command-line interface to various pastebins
wiki2beamer Tool to produce LaTeX Beamer code from wiki-like input
winefish LaTeX editor based on Bluefish
wklej A wklej.org submitter
writerperfect Various formats to Open document format converter
wscr A Lightweight and Fast Anagram Solver
wv Tool for conversion of MSWord doc and rtf files to something readable
wv2 Excellent MS Word filter lib, used in most Office suites
xapian-omega An application built on Xapian, consisting of indexers and a CGI search frontend
xchm Utility for viewing Compiled HTML Help (CHM) files
xdvik DVI previewer for X Window System
xdvipdfmx Extended dvipdfmx for use with XeTeX and other unicode TeXs
xfbib a lightweight BibTeX editor
xhtml1 DTDs for the eXtensible HyperText Markup Language 1.0
xindy A Flexible Indexing System
xiphos A bible study frontend for Sword (formerly known as GnomeSword)
xlhtml Convert MS Excel and Powerpoint files to HTML
xlsx2csv Convert MS Office xlsx files to CSV
XML-Schema-learner Algorithmic inferencing of XML schema definitions and DTDs
xml2 These tools are used to convert XML and HTML to and from a line-oriented format
xml2doc Tool to convert simple XML to a variety of formats (pdf, html, txt, manpage)
xmldiff A tool that figures out the differences between two similar XML files
xmlformat Reformat XML documents to your custom style
xmlstarlet A set of tools to transform, query, validate, and edit XML documents
xmlto script for converting XML and DocBook formatted documents to a variety of output formats
xournal Xournal is an application for notetaking, sketching, and keeping a journal using a stylus
yagf Graphical front-end for cuneiform and tesseract OCR tools
yelp-tools Collection of tools for building and converting documentation
yodl Your Own Document Language: a pre-document language and tools to process it
zathura A highly customizable and functional document viewer
zathura-cb Comic book plug-in for zathura with 7zip, rar, tar and zip support
zathura-djvu DjVu plug-in for zathura
zathura-meta Meta package for app-text/zathura plugins
zathura-pdf-mupdf PDF plug-in for zathura
zathura-pdf-poppler PDF plug-in for zathura
zathura-ps PostScript plug-in for zathura
zemberek-server A Turkish spell checker server based on Zemberek NLP library
zpspell Zemberek-Pardus spell checker interface


316 Packages