"Science is an error-correcting process." – Charles S. Peirce
I'm a doctoral researcher at Tampere University, specializing in machine learning for audio understanding. I'm passionate about teaching machines to hear, interpret, and respond to sound like humans do.
Before entering research, I spent over seven years as a software engineer, which gave me a strong foundation in building scalable systems and solving real-world problems. I now work at the intersection of ML research, software engineering, and AI-driven audio applications, combining scientific depth with hands-on development skills.
- Machine Learning for Audio Understanding (classification, detection, retrieval, generation)
- Self-Supervised and Contrastive Learning
- Multimodal Learning (audio + text/vision)
- Low-Resource Learning (zero-shot, few-shot)
- Programming: Python, Java, Scala, Kotlin, C/C++, GDScript, JavaScript, SQL, HTML/CSS, R, MATLAB, LaTeX
- ML & Data: PyTorch, TensorFlow, scikit-learn, Ray Tune, MLflow, Spark, NumPy, SciPy, pandas, Jupyter
- Audio / NLP: librosa, torchaudio, NLTK
- Web & Backend: Spring Boot, Java EE, Hibernate, Django, Flask
- Databases & DevOps: MySQL, PostgreSQL, Docker, Git, Linux
- Multimodal Audio-Text Retrieval System – Developing models that match audio clips with text queries using multimodal learning.
- X-GoBot [WIP] – Developing a voice-enabled desktop AI assistant with local processing and contextual awareness.
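The retrieval project above can be illustrated with a minimal sketch: rank audio clips by cosine similarity between a text-query embedding and each clip's audio embedding. The toy vectors and clip names below are hypothetical stand-ins; in a real system the embeddings would come from trained (e.g. contrastively aligned) audio and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_emb, audio_embs):
    """Return clip names ranked by similarity to the query embedding."""
    scores = {name: cosine(query_emb, emb) for name, emb in audio_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy embeddings (hypothetical): the query vector is closest to bark.wav,
# so a "dog barking" query would rank that clip first.
text_query = [0.9, 0.1, 0.0]
clips = {
    "bark.wav": [0.8, 0.2, 0.1],
    "rain.wav": [0.1, 0.9, 0.3],
}
print(retrieve(text_query, clips))  # bark.wav ranked first
```

The same ranking loop generalizes to any shared embedding space; only the encoders producing `query_emb` and `audio_embs` change.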
I'm always open to conversations about audio ML, applied AI, or building smart, sound-aware systems. Whether you're in research, industry, or tinkering with side projects, feel free to reach out!
Email: huang.xie@outlook.com
LinkedIn: linkedin.com/in/huang-xie-28b7872bb