Structured sparsity in structured prediction

Martins A.F.T., Smith N.A., Aguiar P.M.Q., Figueiredo M.A.T.

EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

2011

pp. 1500-1511

Abstract:

Linear models have enjoyed great success in structured prediction in NLP. While much progress has been made on efficient training with several loss functions, the problem of endowing learners with a mechanism for feature selection is still unsolved. Common approaches employ ad hoc filtering or L1-regularization; both ignore the structure of the feature space, preventing practitioners from encoding structural prior knowledge. We fill this gap by adopting regularizers that promote structured sparsity, along with efficient algorithms to handle them. Experiments on three tasks (chunking, entity recognition, and dependency parsing) show gains in performance, compactness, and model interpretability.
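
Illustration (not part of the original abstract): the abstract does not say which regularizers or algorithms are used. As a minimal sketch of what "structured sparsity" can mean, the Python below applies the proximal step of a non-overlapping group-lasso penalty, a standard regularizer of this kind, which shrinks or discards whole groups of features at once. The function name, the grouping of features, and the threshold value are assumptions made for this example only.

```python
import math


def group_soft_threshold(weights, groups, tau):
    """Proximal step for the penalty tau * sum_g ||w_g||_2.

    weights: list of floats (model weights).
    groups:  list of index lists; assumed to be a non-overlapping
             partition of the features (hypothetical grouping).
    tau:     non-negative threshold.
    """
    out = list(weights)
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        # Scale the whole group toward zero; a group whose norm is
        # at most tau is zeroed out entirely.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * weights[i]
    return out


# Hypothetical weights: one strong group and one weak group.
w = [3.0, 4.0, 0.1, -0.2]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(w, groups, tau=1.0)
```

In this toy run the first group (norm 5) is scaled by roughly 0.8, while the second group (norm below the threshold) is removed as a unit. Plain L1 regularization, by contrast, would threshold each weight independently and could not express that grouping.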