Here's a list of commands to run so that you can share their output with someone who can help you figure out which resource is the bottleneck in your system. Note that they require the sysstat and procps packages (on Ubuntu, and on RHEL and its derivatives):

uptime
vmstat 1 10
iostat -xN 2 10
mpstat -P ALL 3 10
pidstat 1 10
free -mw (or free -m, if your OS doesn't support -w)

uptime shows the load averages on the system.

vmstat is mostly used to tell whether the system is swapping. If you see significant numbers in the 'si' and 'so' columns, your system is most likely swapping (using the hard drive as RAM), which usually slows performance a lot.

iostat is mostly used to determine whether your disk subsystem is unable to cope with the load. If one or more lines show 100 or close to it (in the %util column) almost constantly, that is probably the case. If it is your swap volume, you probably saw numbers in the 'si' and 'so' columns in the vmsta...
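As a rough illustration of the 'si' and 'so' check described above, here is a minimal Python sketch that reads saved vmstat output and reports the peak swap-in and swap-out values. The function name swap_activity and the sample numbers are made up for this example; the first sample is skipped by default because vmstat's first report gives averages since boot rather than current activity.

```python
from typing import Tuple


def swap_activity(vmstat_output: str, skip_first: bool = True) -> Tuple[int, int]:
    """Return the peak (si, so) values found in saved `vmstat` text output."""
    si_idx = so_idx = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Column-name header line: remember where si and so are.
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        if si_idx is None or not fields:
            continue  # nothing to parse before the header, or a blank line
        if not all(f.isdigit() for f in fields):
            continue  # banner line ("procs ---memory--- ...") or other text
        if len(fields) <= max(si_idx, so_idx):
            continue  # short or damaged row
        samples.append((int(fields[si_idx]), int(fields[so_idx])))
    if skip_first and len(samples) > 1:
        samples = samples[1:]  # first report is the average since boot
    if not samples:
        return (0, 0)
    return (max(s[0] for s in samples), max(s[1] for s in samples))


# Illustrative sample only, not taken from a real system.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  51200 120000  30000 800000    2    3    40    55  300  500  5  2 92  1  0
 3  1  51800  90000  30000 790000  410  620   900  1500  900 1400 20 10 40 30  0
 2  2  53000  85000  30000 780000  380  700   850  1700  950 1500 18 12 38 32  0
"""

print(swap_activity(SAMPLE))  # (410, 700)
```

What counts as "significant" is left to the reader; values that stay at or near zero across all samples suggest the system is not swapping.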
One feature that is lacking from Networker, compared to some of its competitors, is built-in automatic recovery testing. However, where there's an API, there is a way. Networker's REST API is not perfect, but it allows the backup administrator to perform queries about Networker resources (objects). As my workload goes up, I realize that one of the tasks I tend to skip the most is the periodic recovery test. Don't forget that a backup that is not tested should be considered unsuccessful. I also found out that my recovery tests were not diverse enough. When I started this project, I knew a little bit about REST APIs, and nothing about JSON processing. With the Networker REST API documentation, and the help of a friend and Networker Support staff, I was able to create HTTP queries with Postman, cURL and jq. Once I got the queries that I needed, I put them in a bash script that would somehow select one backup, and then restore it. My first attem...
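To give an idea of what the "select one backup" step could look like, here is a minimal Python sketch that picks one backup from a JSON query result. It is not the script described in the post: the response shape, the field names (backups, id, clientHostname, name) and the function name pick_backup_for_test are assumptions made for illustration, and no HTTP call or restore is performed. A client is chosen first and then one of its backups, so that clients with many backups do not dominate the tests.

```python
import json
import random

# Hypothetical response body; field names are assumptions, not taken from
# the Networker REST API documentation.
SAMPLE_RESPONSE = """
{"backups": [
  {"id": "a1", "clientHostname": "web01", "name": "/etc"},
  {"id": "b2", "clientHostname": "web01", "name": "/var/www"},
  {"id": "c3", "clientHostname": "db01",  "name": "/var/lib/pgsql"}
]}
"""


def pick_backup_for_test(response_text, rng=None, exclude_clients=()):
    """Pick one backup at random, or return None if nothing is eligible."""
    rng = rng or random.Random()
    backups = json.loads(response_text).get("backups", [])

    # Group eligible backups by client.
    by_client = {}
    for backup in backups:
        client = backup.get("clientHostname", "unknown")
        if client in exclude_clients:
            continue
        by_client.setdefault(client, []).append(backup)
    if not by_client:
        return None

    # Choose the client first, then one of its backups, to keep tests diverse.
    client = rng.choice(sorted(by_client))
    return rng.choice(by_client[client])


chosen = pick_backup_for_test(SAMPLE_RESPONSE)
print(chosen["clientHostname"], chosen["name"])
```

Passing recently tested clients in exclude_clients is one simple way to avoid restoring from the same machine over and over.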
Here are a few tips that I have discovered or implemented during my career that can help anyone get better results:

- Follow the 3-2-1 backup strategy, or better:
    - 3 copies of the data
    - 2 different media
    - 1 copy offsite
- Test your backups regularly. Ideally, automate your recovery tests.
- Use Whatever-as-code as much as you can. IaC is the first that comes to my mind, but using code to define objects and their properties has several advantages:
    - Auto-documentation of changes
    - Ability to easily roll back (using source code management like Git)
    - Easier standardization and compliance
    - Ansible, Chef, Puppet, Salt, Terraform, Pulumi are good examples
- Maintain a changelog of all the major changes in your infrastructure (those that aren't already "documented" because you're using IaC).
- Keep track of your hardware inventory and plan your renewals and warranty extension purchases. Build a spreadsheet with all the components that you manage and make sure that everyone on the sysadmi...