University of Illinois Computational Linguistics Lab

 
 
RESOURCES

Lab

5 Mandrake Linux stations  
Server space  

Corpora

Arabic Gigaword  
Arabic Treebank Part 2, v2.0  
Arabic Treebank Part 3, v1.0  
Buckwalter Morphological Analyzer  
CallHome Egyptian Arabic Lexicon  
CallHome Egyptian Arabic Transcripts  
Chinese Gigaword  
Chinese Treebank 4.0  
English Gigaword  
Non-Standard Word Corpora  
SwitchBoard  
XTAG Morphological Lexicon  
WS04 Dialectal Chinese Automatic Speech Recognition corpora

Tools

Lextools: Finite state toolkit for linguistic analysis  
FSM tools: Finite State Machine library  
GRM tools: Grammar library  
CART Tree tools: CART tree toolkit  
TextBooster: Boosting toolkit  
Festival: Text-to-Speech  
CLUTO: Clustering toolkit  
HTK: Speech recognition toolkit

Beckman 1420
Last Update: December 30, 2005