Sandia National Laboratories Information Systems Architect (Experienced) in Albuquerque, New Mexico
We are seeking an Information Systems Architect to join a team that administers next-generation high performance computing (HPC) Architectures and develops innovative solutions for operations and efficient utilization. Are you passionate about using your analytical skills to craft solutions to complex technical issues? Are you passionate about your work and want to utilize state-of-the-art facilities to explore solutions? Do you want to join a dynamic team that solves challenging issues for the nation's security? If so, you could be the highly motivated individual we are seeking to join our team.
On any given day, you may be called on to:
- Participate in all aspects of the HPC system lifecycle including facility integration, standup, acceptance testing, performance benchmarking, operational support, and reclamation.
- Maintain all system aspects of security, networks, filesystems, system software installation, and user support.
- Plan patches and upgrades.
- Troubleshoot and replace defective components.
- Determine and proactively mitigate the impact of system changes and improvements, including non-standard research access and software, on the security and stability of the platforms and on the user community.
- Participate in development of new operational methodologies and support infrastructure to enable efficient operations of multiple, concurrent, leading-edge and prototype HPC Clusters.
- Support research and development staff to deliver functional platforms for pre-production systems.
- Work with the HPC Monitoring Team to deploy monitoring solutions on the platforms and use them to understand performance.
- Learn new technologies, processes, and system software in an unstructured environment.
- Bachelor’s degree in Computer Science, Computer Engineering, Information Systems Engineering (CIS/MIS), or significant STEM discipline plus five or more years of significant IT experience; or nine plus years of relevant IT experience with achievements that demonstrate the knowledge, skills and ability to perform the duties of the job.
- 5 years’ experience with Linux/Unix operating systems, including hardware setup, installation, upgrades, and troubleshooting.
- Experience handling large numbers of systems (20-50+).
- Experience answering technical support queries from a community of end users.
- A strong teammate capable of handling multiple duties with good customer focus and excellent oral and interpersonal skills.
- Ability to obtain and maintain a DOE Q security clearance.
- Proven understanding of or experience with installing clusters (initial racking, cabling, and diagnosing hardware issues).
- Advanced scripting experience (Shell, Python, Perl, or any other system-level scripting).
- Basic understanding of networking concepts and infrastructure, such as firewalls, routing, bonding, and VLANs.
- Knowledge of and experience with security and authentication components, such as SSH, Kerberos, LDAP, SSL, nmap, public and private key encryption, and other third-party security products.
- Ability to implement and review system performance monitoring, determine optimization opportunities, and set new resource/capacity requirements.
- Experience with Linux containers and related technologies (Docker, Kubernetes, etc.).
- Experience with storage administration, Fibre Channel SAN, LUN provisioning, NFS filesystem management, and related processes and technologies.
- Understanding of HPC scheduling software (e.g., LSF, Moab, or SLURM).
- Understanding of HPC parallel filesystems (e.g., GPFS, Lustre).
- Understanding of HPC interconnects (e.g., InfiniBand, Omni-Path, high-speed Ethernet).
- Current DOE Q security clearance.
The High Performance Computing (HPC) Development Department supports and develops innovative solutions for the operation and efficient utilization of leading and next-generation computing systems. The Heterogeneous Advanced Architecture testbed Platforms (HAAPs) represent small instances of the most currently available technology in computing so that code developers and computer science researchers can test and evaluate candidate advanced processors and accelerators. Our Advanced Technology Systems (ATS) testbeds enable porting of codes within Sandia's network environment in preparation to run production calculations on the extreme-scale ATS platforms sited at LANL and LLNL. The HPC Monitoring team develops monitoring, analysis, and response software and methodologies to enable new insights into the performance and utilization of the platforms and the applications running on them.
Sandia National Laboratories is the nation’s premier science and engineering lab for national security and technology innovation, with teams of specialists focused on cutting-edge work in a broad array of areas. Some of the main reasons we love our jobs:
- Challenging work with amazing impact that contributes to security, peace, and freedom worldwide
- Extraordinary co-workers
- Some of the best tools, equipment, and research facilities in the world
- Career advancement and enrichment opportunities
- Flexible schedules, generous vacations, strong medical and other benefits, competitive 401k, learning opportunities, relocation assistance, and amenities aimed at creating a solid work/life balance*
World-changing technologies. Life-changing careers. Learn more about Sandia at: http://www.sandia.gov
*These benefits vary by job classification.
Position requires a Department of Energy (DOE) Q-level security clearance.
Sandia is required by DOE to conduct a pre-employment drug test and background review that includes checks of personal references, credit, law enforcement records, and employment/education verifications. Applicants for employment must be able to obtain and maintain a DOE Q-level security clearance, which requires U.S. citizenship. If you hold more than one citizenship (i.e., of the U.S. and another country), your ability to obtain a security clearance may be impacted.
Applicants offered employment with Sandia are subject to a federal background investigation to meet the requirements for access to classified information or matter if the duties of the position require a DOE security clearance. Substance abuse or illegal drug use, falsification of information, criminal activity, serious misconduct or other indicators of untrustworthiness can cause a clearance to be denied or terminated by DOE, resulting in the inability to perform the duties assigned and subsequent termination of employment.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.