xieh97/README.md

👋 Hi, I'm Huang Xie (谒晃)

"Science is an error-correcting process." - Charles S. Peirce

🎓 About Me

I'm a doctoral researcher at Tampere University, specializing in machine learning for audio understanding. I'm passionate about teaching machines to hear, interpret, and respond to sound as humans do.

Before entering research, I spent over 7 years as a software engineer, which gave me a strong foundation in building scalable systems and solving real-world problems. I now work at the intersection of ML research, software engineering, and AI-driven audio applications, combining scientific depth with hands-on development skills.

🧠 Research Interests

  • 🎡 Machine Learning for Audio Understanding (classification, detection, retrieval, generation)
  • πŸ” Self-Supervised and Contrastive Learning
  • πŸ”„ Multimodal Learning (audio + text/vision)
  • 🧩 Low-Resource Learning (zero-shot, few-shot)

🛠️ Skills & Tools

  • 💻 Programming: Python, Java, Scala, Kotlin, C/C++, GDScript, JavaScript, SQL, HTML/CSS, R, MATLAB, LaTeX
  • ⚛️ ML & Data: PyTorch, TensorFlow, scikit-learn, Ray Tune, MLflow, Spark, NumPy, SciPy, Pandas, Jupyter
  • 🗣️ Audio / NLP: librosa, torchaudio, NLTK
  • 🌐 Web & Backend: Spring Boot, Java EE, Hibernate, Django, Flask
  • ⚙️ Databases & DevOps: MySQL, PostgreSQL, Docker, Git, Linux

🧪 Featured Projects

  • 🔎 Multimodal Audio-Text Retrieval System – Developing models that match audio clips with text queries using multimodal learning.
  • 🤖 X-GoBot – 🔧 [WIP] Developing a voice-enabled desktop AI assistant with local processing and contextual awareness.
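
As a rough illustration of what "matching audio clips with text queries" can mean in practice, here is a minimal sketch that ranks clips by cosine similarity in a shared embedding space. Everything in it is an assumption made for illustration: the file names, the hand-made 3-dimensional vectors, and the helpers `cosine` and `rank_clips` are hypothetical and are not taken from the project itself, where the embeddings would come from trained audio and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_clips(query_vec, clip_vecs):
    """Return clip ids sorted from most to least similar to the query."""
    scores = {clip_id: cosine(query_vec, vec) for clip_id, vec in clip_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, hand-made embeddings (hypothetical). A real system would obtain
# these from an audio encoder and a text encoder trained jointly.
clip_vecs = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rain.wav": [0.0, 0.2, 0.9],
    "doorbell.wav": [0.1, 0.9, 0.1],
}
query_vec = [0.8, 0.2, 0.1]  # stands in for an encoded query such as "a dog barking"

ranking = rank_clips(query_vec, clip_vecs)
print(ranking)
```

With these toy vectors the barking clip ranks first, because its vector points in nearly the same direction as the query vector.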

💬 Let's Connect

I'm always open to conversations about audio ML, applied AI, or building smart, sound-aware systems. Whether you work in research or industry, or tinker with side projects, feel free to reach out!

📫 Email: huang.xie@outlook.com
🔗 LinkedIn: linkedin.com/in/huang-xie-28b7872bb

Pinned

  1. language-based-audio-retrieval (Public)

    List of academic resources on Language-Based Audio Retrieval

    1

  2. text-audio-retrieval (Public)

    Implementation of a dual-encoder model for Text-Audio Cross-Modal Retrieval.

    Python

  3. dcase2023-audio-retrieval (Public)

    Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge

    Python 10 3

  4. dcase2022-audio-retrieval (Public)

    Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2022 Challenge

    Python 7 1

  5. contrastive-negative-sampling (Public)

    Source code for negative sampling for contrastive audio-text retrieval (ICASSP 2023)

    Python 3

  6. audiocaps-dl (Public)

    Python program to download AudioCaps from YouTube.com

    Python 1