Python Automation: Scripting for Productivity
Master Python automation for development productivity. Learn essential libraries, scripting techniques, and automation patterns for common development tasks.
Why Python for Automation?
Python is an excellent choice for automation due to its simplicity, extensive library ecosystem, and cross-platform compatibility. It's particularly well-suited for development automation tasks, file processing, and system administration.
Essential Python Libraries
File and Directory Operations
Libraries for file and directory manipulation:
- os: Operating system interface
- pathlib: Object-oriented file system paths
- shutil: High-level file operations
- glob: Unix shell-style pathname pattern matching
- fnmatch: Unix shell-style filename matching
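As a small illustration of how these modules combine, the sketch below selects files by pattern inside a throwaway directory. The helper name find_matching and the sample file names are hypothetical, chosen only for this example.

```python
import fnmatch
import tempfile
from pathlib import Path


def find_matching(root, pattern):
    """Return sorted names of files directly under root matching a shell-style pattern."""
    return sorted(
        p.name for p in Path(root).iterdir()
        if p.is_file() and fnmatch.fnmatch(p.name, pattern)
    )


# Build a scratch directory so the example does not touch real files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("report.txt", "notes.txt", "script.py"):
        (Path(tmp) / name).write_text("sample")
    txt_files = find_matching(tmp, "*.txt")                      # fnmatch on names
    py_files = sorted(p.name for p in Path(tmp).glob("*.py"))   # pathlib globbing

print(txt_files)
print(py_files)
```

fnmatch is handy when the names are already in hand (for example from a listing or an archive), while Path.glob walks the file system itself.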
Text Processing
Libraries for text manipulation and processing:
- re: Regular expression operations
- string: String constants and helpers such as Template
- csv: CSV file reading and writing
- json: JSON data handling
- yaml: YAML file processing (provided by the third-party PyYAML package)
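To show how re and json can work together, here is a minimal sketch that parses log lines with a regular expression and emits the errors as JSON. The log format, the LOG_PATTERN name, and the sample lines are assumptions made for illustration; real logs will likely need a different pattern.

```python
import json
import re

# Hypothetical log format: "<date> <time> <LEVEL> <message>"
LOG_PATTERN = re.compile(r"^(\S+ \S+) (DEBUG|INFO|WARNING|ERROR) (.*)$")


def parse_log_lines(lines):
    """Turn matching log lines into dictionaries; skip lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            timestamp, level, message = match.groups()
            records.append(
                {"timestamp": timestamp, "level": level, "message": message}
            )
    return records


sample = [
    "2024-01-15 10:00:01 INFO Service started",
    "2024-01-15 10:00:05 ERROR Connection refused",
    "not a log line",
]
records = parse_log_lines(sample)
errors = [r for r in records if r["level"] == "ERROR"]
print(json.dumps(errors, indent=2))
```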
Network and Web
Libraries for network operations and web automation:
- requests: Third-party HTTP client library
- urllib: Standard-library URL handling modules
- socket: Low-level networking interface
- selenium: Third-party web browser automation
- beautifulsoup4: Third-party HTML/XML parsing (imported as bs4)
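Before any request is sent, URLs usually have to be assembled and inspected. The sketch below does that with the standard-library urllib.parse module only. The build_url helper and the path and query values are hypothetical, and api.example.com is a placeholder host.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlparse


def build_url(base, path, params):
    """Join a base URL and path, then append URL-encoded query parameters."""
    url = urljoin(base, path)
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Placeholder host; nothing is sent over the network here
url = build_url("https://api.example.com/", "v1/items", {"page": 2, "q": "log files"})
parts = urlparse(url)

print(url)
print(parts.netloc, parts.path, parse_qs(parts.query))
```

Letting urlencode handle escaping avoids hand-built query strings that break on spaces or special characters.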
File and Directory Automation
File Operations
Automate common file operations:
    import os
    import shutil

    # Create a directory (no error if it already exists)
    os.makedirs('new_directory', exist_ok=True)

    # Copy a file
    shutil.copy('source.txt', 'destination.txt')

    # Move a file
    shutil.move('old_location.txt', 'new_location.txt')

    # Delete a file
    os.remove('file_to_delete.txt')

    # List the entries in the current directory
    files = os.listdir('.')
    for file in files:
        print(file)
Directory Operations
Automate directory management tasks:
    from pathlib import Path

    # Create a nested directory structure
    Path('project/src/modules').mkdir(parents=True, exist_ok=True)

    # Find files by pattern in the current directory
    for filepath in Path('.').glob('*.py'):
        print(f"Python file: {filepath}")

    # Recursive file search
    for filepath in Path('.').rglob('*.txt'):
        print(f"Text file: {filepath}")

    # Get file information
    file_path = Path('example.txt')
    if file_path.exists():
        print(f"Size: {file_path.stat().st_size} bytes")
        # st_mtime is a Unix timestamp (seconds since the epoch)
        print(f"Modified: {file_path.stat().st_mtime}")
Text Processing Automation
File Processing
Process text files efficiently:
    def process_log_file(filename):
        """Process log file and extract errors"""
        errors = []
        with open(filename, 'r') as file:
            for line_num, line in enumerate(file, 1):
                if 'ERROR' in line:
                    errors.append(f"Line {line_num}: {line.strip()}")
        return errors

    def replace_in_file(filename, old_text, new_text):
        """Replace text in file"""
        with open(filename, 'r') as file:
            content = file.read()
        content = content.replace(old_text, new_text)
        with open(filename, 'w') as file:
            file.write(content)

    # Usage
    errors = process_log_file('application.log')
    replace_in_file('config.txt', 'localhost', 'production-server')
CSV and JSON Processing
Handle structured data files:
    import csv
    import json

    def process_csv_file(filename):
        """Process CSV file and convert to JSON"""
        data = []
        # newline='' lets the csv module handle line endings inside quoted fields
        with open(filename, 'r', newline='') as file:
            reader = csv.DictReader(file)
            for row in reader:
                data.append(row)
        # Save as JSON
        with open('output.json', 'w') as file:
            json.dump(data, file, indent=2)
        return data

    def filter_json_data(filename, condition):
        """Filter JSON data based on condition"""
        with open(filename, 'r') as file:
            data = json.load(file)
        # condition is a callable that returns True for items to keep
        filtered_data = [item for item in data if condition(item)]
        with open('filtered.json', 'w') as file:
            json.dump(filtered_data, file, indent=2)
        return filtered_data
Web Automation
HTTP Requests
Automate web requests and API calls:
    import requests

    def fetch_api_data(url, headers=None):
        """Fetch data from API"""
        try:
            # Without a timeout, requests can wait indefinitely on a stalled server
            response = requests.get(url, headers=headers, timeout=10)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error fetching data: {e}")
            return None

    def post_data(url, data, headers=None):
        """Post data to API"""
        try:
            response = requests.post(url, json=data, headers=headers, timeout=10)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            print(f"Error posting data: {e}")
            return None

    # Usage
    api_data = fetch_api_data('https://api.example.com/data')
    if api_data:
        print(f"Fetched {len(api_data)} items")
Web Scraping
Automate web scraping tasks:
    from bs4 import BeautifulSoup
    import requests

    def scrape_website(url, selector):
        """Scrape data from website"""
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')
            # selector is a CSS selector, for example 'h2.title'
            elements = soup.select(selector)
            data = []
            for element in elements:
                data.append(element.get_text().strip())
            return data
        except requests.exceptions.RequestException as e:
            print(f"Error scraping website: {e}")
            return []

    def scrape_links(url):
        """Extract all links from webpage"""
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')
            links = soup.find_all('a', href=True)
            return [link['href'] for link in links]
        except requests.exceptions.RequestException as e:
            print(f"Error scraping links: {e}")
            return []
System Administration
Process Management
Automate system process management:
    import subprocess
    import psutil

    def run_command(command):
        """Run system command and return output"""
        try:
            # shell=True passes the string to the shell; use it only with trusted input
            result = subprocess.run(
                command, shell=True, capture_output=True, text=True
            )
            return result.stdout, result.stderr, result.returncode
        except Exception as e:
            print(f"Error running command: {e}")
            return None, None, -1

    def monitor_process(process_name):
        """Monitor specific process"""
        for proc in psutil.process_iter(['pid', 'name', 'cpu_percent']):
            # name can be None when the process details are not accessible
            name = proc.info['name'] or ''
            if process_name in name:
                print(f"Process found: {name} (PID: {proc.info['pid']})")

    def kill_process_by_name(process_name):
        """Kill process by name"""
        for proc in psutil.process_iter(['pid', 'name']):
            name = proc.info['name'] or ''
            if process_name in name:
                try:
                    proc.kill()
                    print(f"Killed process {name} (PID: {proc.info['pid']})")
                except psutil.NoSuchProcess:
                    print(f"Process {name} not found")
                except psutil.AccessDenied:
                    print(f"Not permitted to kill {name}")
System Monitoring
Monitor system resources and performance:
    import json
    import time

    import psutil

    def get_system_info():
        """Get comprehensive system information using psutil"""
        cpu = psutil.cpu_percent(interval=1)
        memory = psutil.virtual_memory()._asdict()
        disk = psutil.disk_usage('/')._asdict()
        network = psutil.net_io_counters()._asdict()
        processes = len(psutil.pids())
        # Returns system metrics
        return cpu, memory, disk, network, processes

    def monitor_system(duration, interval):
        """Monitor system for specified duration (both values in seconds)"""
        start_time = time.time()
        data = []
        while time.time() - start_time < duration:
            metrics = get_system_info()
            data.append(metrics)
            time.sleep(interval)
        return data

    def save_system_log(filename, data):
        """Save system monitoring data to file"""
        with open(filename, 'w') as file:
            json.dump(data, file, indent=2)
Database Automation
Database Operations
Automate database operations:
    import sqlite3
    import pandas as pd

    def create_database(db_name):
        """Create SQLite database with tables"""
        conn = sqlite3.connect(db_name)
        cursor = conn.cursor()
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS users (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                email TEXT UNIQUE,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        ''')
        conn.commit()
        conn.close()

    def insert_data(db_name, data):
        """Insert data into database"""
        conn = sqlite3.connect(db_name)
        cursor = conn.cursor()
        for record in data:
            # Parameterized query: values are bound, not interpolated into the SQL
            cursor.execute(
                "INSERT INTO users (name, email) VALUES (?, ?)",
                (record['name'], record['email'])
            )
        conn.commit()
        conn.close()

    def query_data(db_name, query):
        """Query data from database"""
        conn = sqlite3.connect(db_name)
        df = pd.read_sql_query(query, conn)
        conn.close()
        return df
Email Automation
Email Sending
Automate email operations:
    import os
    import smtplib
    from email.mime.text import MIMEText
    from email.mime.multipart import MIMEMultipart
    from email.mime.base import MIMEBase
    from email import encoders

    def send_email(sender, recipient, subject, body, attachment=None):
        """Send email with optional attachment"""
        msg = MIMEMultipart()
        msg['From'] = sender
        msg['To'] = recipient
        msg['Subject'] = subject
        msg.attach(MIMEText(body, 'plain'))
        if attachment:
            with open(attachment, 'rb') as file:
                part = MIMEBase('application', 'octet-stream')
                part.set_payload(file.read())
            encoders.encode_base64(part)
            # Send only the file name, not the full local path
            part.add_header(
                'Content-Disposition', 'attachment',
                filename=os.path.basename(attachment)
            )
            msg.attach(part)
        try:
            # The with block closes the connection even if sending fails
            with smtplib.SMTP('smtp.gmail.com', 587) as server:
                server.starttls()
                # Placeholder password: load real credentials from a secure source
                server.login(sender, 'your_password')
                text = msg.as_string()
                server.sendmail(sender, recipient, text)
            print("Email sent successfully")
        except Exception as e:
            print(f"Error sending email: {e}")
Task Scheduling
Automated Task Execution
Schedule and execute automated tasks:
    import schedule
    import time
    import logging

    # Set up logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )

    def daily_backup():
        """Perform daily backup"""
        logging.info("Starting daily backup...")
        # Backup logic here
        logging.info("Daily backup completed")

    def weekly_cleanup():
        """Perform weekly cleanup"""
        logging.info("Starting weekly cleanup...")
        # Cleanup logic here
        logging.info("Weekly cleanup completed")

    def hourly_monitoring():
        """Perform hourly system monitoring"""
        logging.info("Performing system monitoring...")
        # Monitoring logic here
        logging.info("System monitoring completed")

    # Schedule tasks
    schedule.every().day.at("02:00").do(daily_backup)
    schedule.every().sunday.at("03:00").do(weekly_cleanup)
    schedule.every().hour.do(hourly_monitoring)

    def run_scheduler():
        """Run the task scheduler"""
        while True:
            schedule.run_pending()
            time.sleep(60)
Error Handling and Logging
Robust Error Handling
Implement proper error handling in automation scripts:
    import logging
    import traceback
    from functools import wraps

    def setup_logging():
        """Set up logging configuration"""
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
            handlers=[
                logging.FileHandler('automation.log'),
                logging.StreamHandler()
            ]
        )

    def error_handler(func):
        """Decorator for error handling"""
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                logging.error(f"Error in {func.__name__}: {str(e)}")
                logging.error(traceback.format_exc())
                return None
        return wrapper

    @error_handler
    def risky_operation():
        """Example of risky operation with error handling"""
        # Some operation that might fail
        result = 1 / 0  # This will raise an exception
        return result
Best Practices
Code Organization
Organize automation scripts effectively:
- Modular Design: Break scripts into functions and modules
- Configuration Files: Use config files for settings
- Error Handling: Implement comprehensive error handling
- Logging: Add detailed logging for debugging
- Documentation: Document functions and scripts
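One way these points might come together is sketched below: settings live in a JSON config file, each step is a small documented function, and progress is reported through logging. The file name automation.json, the DEFAULTS keys, and the function names are all hypothetical and exist only for this example.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("automation")

# Hypothetical default settings, used when the config file omits a key
DEFAULTS = {"source_dir": ".", "pattern": "*.log", "dry_run": True}


def load_config(path):
    """Read settings from a JSON file and fill in missing keys from DEFAULTS."""
    config = dict(DEFAULTS)
    config_path = Path(path)
    if config_path.exists():
        config.update(json.loads(config_path.read_text()))
    else:
        logger.warning("Config file %s not found; using defaults", path)
    return config


def find_targets(config):
    """Return the files selected by the configured directory and pattern."""
    return sorted(Path(config["source_dir"]).glob(config["pattern"]))


def main(config_path):
    """Entry point: load settings, then report (or act on) matching files."""
    config = load_config(config_path)
    targets = find_targets(config)
    for target in targets:
        action = "Would process" if config["dry_run"] else "Processing"
        logger.info("%s %s", action, target)
    return targets


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    main("automation.json")
```

Keeping a dry_run flag in the config makes it possible to check what a script would touch before letting it change anything.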
Performance Optimization
Optimize automation scripts for performance:
- Batch Operations: Process multiple items at once
- Async Operations: Use async/await for I/O-bound operations
- Memory Management: Manage memory usage efficiently
- Caching: Cache frequently accessed data
- Resource Cleanup: Properly close files and connections
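Two of these ideas, batching and caching, can be sketched with the standard library alone. In the example below, batched is a hypothetical helper that groups items into fixed-size chunks, and lookup_owner is a stand-in for a slow call (such as an API request) whose results are cached with functools.lru_cache.

```python
from functools import lru_cache
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


call_count = 0  # counts how often the "slow" lookup really runs


@lru_cache(maxsize=128)
def lookup_owner(project):
    """Stand-in for a slow lookup; repeated arguments are served from the cache."""
    global call_count
    call_count += 1
    return f"owner-of-{project}"


projects = ["api", "web", "api", "docs", "web", "api", "cli"]
for batch in batched(projects, 3):
    owners = [lookup_owner(name) for name in batch]
    print(batch, owners)

print(f"lookups performed: {call_count}")
```

Seven names are processed, but the lookup body runs only once per distinct name; the rest are cache hits.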
Conclusion
Python automation is a powerful tool for increasing development productivity. By mastering the essential libraries and techniques, you can automate repetitive tasks, streamline workflows, and focus on more important work.
Start with simple automation tasks and gradually build more complex solutions. Remember to always test your scripts thoroughly and implement proper error handling and logging for production use.