NII Technical Report (NII-2006-012E)

Title Building a Terabyte-scale Web Data Collection "NW1000G-04" in the NTCIR-5 WEB Task
Authors Masao Takaku, Keizo Oyama, Akiko Aizawa, Haruko Ishikawa, Kengo Minamide, Shin Kato, Hayato Yamana, and Junya Hayashi
Abstract We built a terabyte-scale web data collection, NW1000G-04, which was used in the NTCIR-5 WEB task. This paper describes in detail how the collection was built and reports statistics on it.
Language English
Published Sep 7, 2006
Pages 8p
PDF File 06-012E.pdf

NII Technical Reports
National Institute of Informatics