Learning from Fine-Grained and Long-Tailed Visual Data