Under supervision, perform the activities and team leadership required to support a large, complex Linux-based computing environment and an increasing transition to Linux infrastructure in AWS. Assist in driving an “infrastructure as code” mentality throughout the organization and demonstrate a passion for automation concepts and tools. Use customer service skills while acting as a technical resource to internal departments and system users. Use technical skills to proactively put scripts and documentation in place to comply with current standards.
Primary Duties and Responsibilities:
- Provide advanced system administration, operational support, and problem resolution for a large, complex Linux computing environment, including both virtualized and physical servers
- Create and patch AMIs, submit pull requests, and write automation code
- Perform Linux administration including changes, deletions, disk space management, application installation, and backup
- Use your infrastructure and networking knowledge to maintain cloud-based infrastructure, predominantly on AWS, involving EC2, S3, RDS, and VPC
- Use configuration management tools (primarily Ansible and Terraform) to build and maintain a hybrid infrastructure hosted both at colocation facilities and in the public cloud.
- Work directly with the development team to build supporting infrastructure for specific new application functionality.
- Run proof-of-concept projects on early-stage infrastructure improvements to validate the feasibility of an approach, evaluate performance, and spike an implementation.
- Review and evaluate virtual and physical server performance and capacity
- Forecast system demands and recommend upgrades, expansions, and reconfigurations
- Perform automated computing environment builds, site setup, user training, hardware/software installation, maintenance and support and documentation of operating procedures and processes
- Support the VMware environment, including changes, adding/removing systems, and disk space management.
- Troubleshoot hardware and software problems, take appropriate corrective action, and/or interact with IT staff or vendors in performing complex testing, support, server recovery, and troubleshooting functions.
- Assist with development and testing of changes needed to maintain the disaster recovery (DR) environment
- Follow the change management process
- Comply with all audit, compliance, and regulatory requirements
- Attend meetings as a team representative
- Support on-call, weekend, and off-hours work as needed
- Perform other duties as assigned
Qualifications:
- Good consultative, communication, analytical, and judgment skills
- Ability to work effectively with clients, technical staff, consultants and vendors
- Ability to work well under pressure and within deadlines
- Experience with disaster recovery testing and creating technical documentation
- Ability to communicate well and perform as part of a team located in multiple cities
- Extensive knowledge of Linux operating systems, Linux shells and standard utilities, and common Linux security tools
- In-depth system administration knowledge and skills for Red Hat Linux. Knowledge of Amazon Linux is a plus.
- Experience using GitHub or other version control tools for source code management
- Experience using configuration management tools such as Puppet, Chef, or Ansible and container tools such as Docker
- Ability to write and maintain automation code, scripts, and infrastructure as code (IaC), such as Terraform
- Familiarity with DevOps activities and using CI/CD pipeline software to deploy code
- Working knowledge of cloud components and services in AWS or Azure.
- System administration experience and knowledge of VMware and administration of virtual servers
- GRUB, PXE boot, Kickstart
- Yum, RPMs, Satellite server
- SVM, LVM, Boot from SAN, UFS/ZFS, filesystem configuration
- General working knowledge of NAS, SAN, and networking
- Experience with GitHub, Ansible, Jenkins, and Terraform tools/applications
- Knowledge or experience with DevOps, OpenShift, AWS cloud, or other similar technologies is desirable
Education and/or Experience:
- Bachelor’s degree in Computer Science or a related discipline or an equivalent combination of education and work experience.
- Three or more years’ experience in Linux systems installation, operations, administration, and maintenance of physical and virtualized servers
OCC is an Equal Opportunity Employer