Site Reliability Engineer

Taipei / KKCompany - Engineering / Permanent

KKCompany is Asia’s leading music entertainment company. Started by a group of music-loving Internet software developers, we built and launched one of the world’s first music streaming services in 2005. Based in Taipei, the heart of Chinese pop music, we gradually grew our business from Taiwan to Hong Kong, Singapore, Malaysia and Japan. Ever curious about reinvention and the business models of the future, we have expanded our scope from music streaming to live events, technology services, content and investments, and we continue to explore innovation in the digital entertainment space.


  • Engage in and improve the lifecycle of services from design to deployment, operation and refinement.
  • Support services before production stage through system design consulting, platform development, capacity planning and launch reviews.
  • Maintain services in production stage by monitoring availability, performance, resources and other related metrics.
  • Construct and scale systems or platforms through automation and infrastructure as code.
  • Practice incident response and blameless post-mortems.
Requirements:

  • Have experience in one or more of the following: Go, Perl, Python, Ruby or shell scripting.
  • Have experience with Unix/Linux administration (filesystems, processes, signals, etc.) and networking (TCP/IP, HTTP, DNS, etc.).
  • Have experience in architecting systems for high availability, reliability, scalability and security.
  • Have experience with version control systems (Git, Mercurial, SVN, etc.).
Nice To Have:

  • Have experience in automating infrastructure configuration (Ansible, Chef, Puppet, Terraform, etc.) and monitoring (Cacti, Nagios, Prometheus, etc.).
  • Have experience in operating containerized environments (Docker Swarm, Kubernetes, Nomad, etc.).
  • Have experience in managing distributed systems in cloud environments such as AWS or GCP.
  • Have experience in analyzing and troubleshooting large scale distributed systems.
  • Have experience in technical writing or documentation.