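NestJS Health Checks
Learning objectives
- Learn how to use the NestJS health check module
- Understand how health indicators are defined and when to use them
- Learn how to integrate health checks with a monitoring system
- Understand the difference between readiness checks and liveness checks
- Learn best practices for health checks and how to handle common problems
Core concepts
1. Introduction to health checks
Health checks are an important way to monitor the state of an application: they help us detect and resolve problems early. In NestJS, health checks are provided by the @nestjs/terminus package. With it you can:
- Check whether the application itself is running
- Check the state of dependencies (databases, Redis, and so on)
- Expose health information as a structured JSON response that dashboards can visualize
- Feed monitoring systems such as Prometheus and Grafana (with additional tooling such as prom-client, covered in section 7)
- Support Kubernetes readiness and liveness checks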
2. Installation and configuration
First, install the health check module:
npm install --save @nestjs/terminus
Then import the health check module in the application's root module:
// src/app.module.ts
import { Module } from '@nestjs/common';
import { TerminusModule } from '@nestjs/terminus';
import { AppController } from './app.controller';
import { AppService } from './app.service';

@Module({
  imports: [TerminusModule],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}
3. Basic usage
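3.1 Creating a health check controller
// src/health/health.controller.ts
// Register this controller in the `controllers` array of a module that imports TerminusModule.
import { Controller, Get } from '@nestjs/common';
import { HealthCheck, HealthCheckService } from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(private readonly health: HealthCheckService) {}

  @Get()
  @HealthCheck()
  async check() {
    // An empty list means "no indicators": the endpoint only proves the app can respond.
    return this.health.check([]);
  }
}
3.2 Adding health indicators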
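We can add various health indicators, such as a database connection, a Redis connection, or an HTTP service. Note that HttpHealthIndicator relies on the @nestjs/axios package (and axios), so install it and import HttpModule alongside TerminusModule; TypeOrmHealthIndicator requires a configured TypeOrmModule.
// src/health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import {
  HealthCheck,
  HealthCheckService,
  HttpHealthIndicator,
  TypeOrmHealthIndicator,
} from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly http: HttpHealthIndicator,
    private readonly db: TypeOrmHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  async check() {
    return this.health.check([
      // Each entry is a function so that the check runs when the endpoint is called.
      () => this.http.pingCheck('nestjs-docs', 'https://docs.nestjs.com'),
      () => this.db.pingCheck('database'),
    ]);
  }
}
4. Health indicators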
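The NestJS health check module ships with several built-in health indicators, including:
- HttpHealthIndicator: checks that an HTTP service responds
- TypeOrmHealthIndicator: checks the TypeORM database connection
- MongooseHealthIndicator: checks the MongoDB connection
- MicroserviceHealthIndicator: checks a microservice transport. Redis can be checked this way through the Redis transport, because @nestjs/terminus does not export a dedicated RedisHealthIndicator
- MemoryHealthIndicator and DiskHealthIndicator: check memory usage and disk storage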
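Whichever indicators are used, each one reports an object keyed by its name, and the results are merged into a single response with status, info, error, and details fields. The sketch below imitates that merging without any framework code, purely to make the response layout concrete. The names aggregate, IndicatorResult, and HealthResponse are hypothetical and are not part of @nestjs/terminus.

```typescript
// Hypothetical, framework-free sketch of how indicator results are merged.
type IndicatorStatus = 'up' | 'down';
type IndicatorResult = Record<string, { status: IndicatorStatus; [detail: string]: unknown }>;

interface HealthResponse {
  status: 'ok' | 'error';
  info: IndicatorResult;    // indicators that reported "up"
  error: IndicatorResult;   // indicators that reported "down"
  details: IndicatorResult; // every indicator, regardless of status
}

function aggregate(results: IndicatorResult[]): HealthResponse {
  const info: IndicatorResult = {};
  const error: IndicatorResult = {};
  const details: IndicatorResult = {};
  for (const result of results) {
    for (const [key, value] of Object.entries(result)) {
      details[key] = value;
      if (value.status === 'up') {
        info[key] = value;
      } else {
        error[key] = value;
      }
    }
  }
  // A single failing indicator is enough to mark the whole response as failed.
  const status = Object.keys(error).length === 0 ? 'ok' : 'error';
  return { status, info, error, details };
}
```

In the real module a failed check is also reported with HTTP status 503, which is what probes and load balancers react to.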
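5. Custom health indicators
Besides the built-in health indicators, we can also create our own. The example below uses the class-based HealthIndicator API; recent Terminus releases deprecate it in favor of HealthIndicatorService, so check the documentation for the version you have installed.
// src/health/custom.health.ts
import { Injectable } from '@nestjs/common';
import { HealthIndicator, HealthIndicatorResult, HealthCheckError } from '@nestjs/terminus';

@Injectable()
export class CustomHealthIndicator extends HealthIndicator {
  async isHealthy(key: string, options: { threshold: number }): Promise<HealthIndicatorResult> {
    // Simulated check logic: replace the random draw with a real check.
    const healthStatus = Math.random() > options.threshold;
    const result = this.getStatus(key, healthStatus, {
      message: healthStatus ? 'Service is healthy' : 'Service is unhealthy',
    });
    if (!healthStatus) {
      // Throwing marks the overall health check as failed.
      throw new HealthCheckError('Custom health check failed', result);
    }
    return result;
  }
}
Then register CustomHealthIndicator as a provider and use it in the health controller: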
// src/health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import { HealthCheck, HealthCheckService } from '@nestjs/terminus';
import { CustomHealthIndicator } from './custom.health';

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly customHealthIndicator: CustomHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  async check() {
    return this.health.check([
      () => this.customHealthIndicator.isHealthy('custom-service', { threshold: 0.5 }),
    ]);
  }
}
6. Health check configuration
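6.1 Basic configuration
We can pass basic options when importing the health check module:
// src/app.module.ts
import { Module } from '@nestjs/common';
import { TerminusModule } from '@nestjs/terminus';
import { AppController } from './app.controller';
import { AppService } from './app.service';

@Module({
  imports: [
    TerminusModule.forRoot({
      // `logger` accepts a custom LoggerService class, or `false` to disable logging;
      // the global `console` object is not a valid value, so it is omitted here.
      errorLogStyle: 'pretty',
    }),
  ],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}
6.2 Custom response format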
We can customize the response format of the health check. Because the handler injects the response object with @Res(), it must send the response itself:
// src/health/health.controller.ts
import { Controller, Get, Res } from '@nestjs/common';
import { Response } from 'express';
import { HealthCheck, HealthCheckService, HttpHealthIndicator } from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly http: HttpHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  async check(@Res() res: Response) {
    try {
      const result = await this.health.check([
        () => this.http.pingCheck('nestjs-docs', 'https://docs.nestjs.com'),
      ]);
      return res.json(result);
    } catch (error) {
      // health.check() throws when any indicator fails, so the 503 body is built here.
      return res.status(503).json({
        status: 'error',
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date().toISOString(),
      });
    }
  }
}
7. Integrating with monitoring systems
7.1 Integrating with Prometheus
First, install the Prometheus client:
npm install --save prom-client
Then create a Prometheus metrics service:
// src/metrics/metrics.service.ts
import { Injectable } from '@nestjs/common';
import { register, Counter, Gauge, Histogram } from 'prom-client';

@Injectable()
export class MetricsService {
  private readonly httpRequestsTotal: Counter<string>;
  private readonly httpRequestDurationSeconds: Histogram<string>;
  private readonly appHealth: Gauge<string>;

  constructor() {
    // Reset the global registry so that re-creating this service (for example in tests)
    // does not fail on duplicate metric names. This also drops metrics registered elsewhere.
    register.clear();
    // Set default labels
    register.setDefaultLabels({
      app: 'nestjs-application',
    });
    // Create the metrics
    this.httpRequestsTotal = new Counter({
      name: 'http_requests_total',
      help: 'Total number of HTTP requests',
      labelNames: ['method', 'route', 'status'],
    });
    this.httpRequestDurationSeconds = new Histogram({
      name: 'http_request_duration_seconds',
      help: 'HTTP request duration in seconds',
      labelNames: ['method', 'route', 'status'],
      buckets: [0.1, 0.5, 1, 2, 5],
    });
    this.appHealth = new Gauge({
      name: 'app_health',
      help: 'Application health status',
      labelNames: ['service'],
    });
  }

  // Record an HTTP request (duration is in seconds)
  recordHttpRequest(method: string, route: string, status: number, duration: number) {
    this.httpRequestsTotal.labels(method, route, status.toString()).inc();
    this.httpRequestDurationSeconds.labels(method, route, status.toString()).observe(duration);
  }

  // Set the application health status (for example 1 for healthy, 0 for unhealthy)
  setAppHealth(service: string, status: number) {
    this.appHealth.labels(service).set(status);
  }

  // Return all metrics in the Prometheus text format
  async getMetrics() {
    return register.metrics();
  }
}
Create the metrics controller:
// src/metrics/metrics.controller.ts
import { Controller, Get, Res } from '@nestjs/common';
import { Response } from 'express';
import { register } from 'prom-client';
import { MetricsService } from './metrics.service';

@Controller('metrics')
export class MetricsController {
  constructor(private readonly metricsService: MetricsService) {}

  @Get()
  async getMetrics(@Res() res: Response) {
    const metrics = await this.metricsService.getMetrics();
    // Use the content type that prom-client advertises for the text exposition format.
    res.set('Content-Type', register.contentType);
    res.send(metrics);
  }
}
Create the metrics module:
// src/metrics/metrics.module.ts
import { Module } from '@nestjs/common';
import { MetricsController } from './metrics.controller';
import { MetricsService } from './metrics.service';

@Module({
  controllers: [MetricsController],
  providers: [MetricsService],
  exports: [MetricsService],
})
export class MetricsModule {}
Import the metrics module in the application module:
// src/app.module.ts
import { Module } from '@nestjs/common';
import { TerminusModule } from '@nestjs/terminus';
import { MetricsModule } from './metrics/metrics.module';
import { AppController } from './app.controller';
import { AppService } from './app.service';

@Module({
  imports: [
    TerminusModule,
    MetricsModule,
  ],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}
7.2 Integrating with Grafana
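We can use Grafana to visualize the Prometheus metrics. First add Prometheus as a data source in Grafana, then build a dashboard on top of metrics such as app_health and http_request_duration_seconds.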
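The sections above expose an app_health gauge but do not show how health results become gauge values, or what a dashboard ends up querying. The following sketch, which assumes the common convention of 1 for "up" and 0 for "down", maps indicator statuses to numbers and renders one sample line in the Prometheus text format. The helpers toGaugeValues and renderSample are hypothetical illustrations, not part of prom-client, and label-value escaping is deliberately left out.

```typescript
// Hypothetical helpers; prom-client does the real rendering in practice.
type Details = Record<string, { status: 'up' | 'down' }>;

// Convert each indicator status to a gauge value: 1 for "up", 0 for "down".
function toGaugeValues(details: Details): Map<string, number> {
  const values = new Map<string, number>();
  for (const [service, { status }] of Object.entries(details)) {
    values.set(service, status === 'up' ? 1 : 0);
  }
  return values;
}

// Render one sample line in the Prometheus text exposition format.
// Simplification: label values are assumed to need no escaping.
function renderSample(name: string, labels: Record<string, string>, value: number): string {
  const labelText = Object.entries(labels)
    .map(([key, labelValue]) => `${key}="${labelValue}"`)
    .join(',');
  return `${name}{${labelText}} ${value}`;
}
```

Each value from toGaugeValues could be passed to setAppHealth(service, value) in the metrics service. A Grafana panel could then chart the app_health series per service, and an expression such as app_health == 0 would select the services that are currently down.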
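8. Readiness and liveness checks
In a Kubernetes environment, two kinds of health check are usually needed:
- Liveness probe: detects whether the application is still running. If the check fails, Kubernetes restarts the container.
- Readiness probe: detects whether the application is ready to receive traffic. If the check fails, Kubernetes removes the Pod from the Service endpoints.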
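The difference between the two probes lies in what Kubernetes does once a probe has failed often enough. As a small model of that behavior (not Kubernetes code), the sketch below uses the real probe field failureThreshold, whose default is 3, while the function actionFor and the action names are hypothetical.

```typescript
// Hypothetical model of the kubelet's reaction to repeated probe failures.
type ProbeKind = 'liveness' | 'readiness';
type ProbeAction = 'none' | 'restart-container' | 'remove-from-endpoints';

// failureThreshold mirrors the Kubernetes probe field of the same name (default 3).
function actionFor(kind: ProbeKind, consecutiveFailures: number, failureThreshold = 3): ProbeAction {
  if (consecutiveFailures < failureThreshold) {
    return 'none'; // not enough consecutive failures yet
  }
  // Liveness failures restart the container; readiness failures only stop traffic.
  return kind === 'liveness' ? 'restart-container' : 'remove-from-endpoints';
}
```

This is why dependency checks usually belong in the readiness check: a database outage should take the Pod out of rotation, not trigger a restart loop.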
Both checks can be implemented in the health controller:
// src/health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import {
  HealthCheck,
  HealthCheckService,
  HttpHealthIndicator,
  TypeOrmHealthIndicator,
} from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly http: HttpHealthIndicator,
    private readonly db: TypeOrmHealthIndicator,
  ) {}

  // Liveness check: no dependencies, it only proves the process can respond
  @Get('liveness')
  @HealthCheck()
  async liveness() {
    return this.health.check([]);
  }

  // Readiness check: verifies the dependencies needed to serve traffic
  @Get('readiness')
  @HealthCheck()
  async readiness() {
    return this.health.check([
      () => this.db.pingCheck('database'),
      () => this.http.pingCheck('nestjs-docs', 'https://docs.nestjs.com'),
    ]);
  }
}
Then reference the endpoints in the Kubernetes manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nestjs-application
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nestjs-application
  template:
    metadata:
      labels:
        app: nestjs-application
    spec:
      containers:
        - name: nestjs-application
          image: nestjs-application:latest
          ports:
            - containerPort: 3000
          livenessProbe:
            httpGet:
              path: /health/liveness
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
Practical case study
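Case 1: A complete health check system
Requirements
We need to implement a complete health check system that:
- Checks the basic state of the application
- Checks the database connection
- Checks the Redis connection
- Checks an external API service
- Integrates with a monitoring system
- Supports Kubernetes readiness and liveness checks
Implementation
- Install the required dependencies (@nestjs/axios and axios are needed by HttpHealthIndicator):
npm install --save @nestjs/terminus @nestjs/axios axios @nestjs/typeorm typeorm mysql2 ioredis prom-client
- Create the health check module: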
// src/health/health.module.ts
import { Module } from '@nestjs/common';
import { HttpModule } from '@nestjs/axios';
import { TerminusModule } from '@nestjs/terminus';
import { TypeOrmModule } from '@nestjs/typeorm';
import { HealthController } from './health.controller';
import { CustomHealthIndicator } from './custom.health';

@Module({
  imports: [
    TerminusModule,
    // Required by HttpHealthIndicator
    HttpModule,
    TypeOrmModule.forRoot({
      type: 'mysql',
      host: process.env.DB_HOST || 'localhost',
      port: parseInt(process.env.DB_PORT ?? '3306', 10),
      username: process.env.DB_USERNAME || 'root',
      password: process.env.DB_PASSWORD || 'password',
      database: process.env.DB_NAME || 'nestjs',
      autoLoadEntities: true,
      // Convenient in development; avoid schema synchronization in production.
      synchronize: true,
    }),
  ],
  controllers: [HealthController],
  providers: [CustomHealthIndicator],
})
export class HealthModule {}
- Create the health check controller:
// src/health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import {
  HealthCheck,
  HealthCheckError,
  HealthCheckService,
  HealthIndicatorResult,
  HttpHealthIndicator,
  TypeOrmHealthIndicator,
} from '@nestjs/terminus';
import Redis from 'ioredis';
import { CustomHealthIndicator } from './custom.health';

@Controller('health')
export class HealthController {
  private readonly redisClient: Redis;

  constructor(
    private readonly health: HealthCheckService,
    private readonly http: HttpHealthIndicator,
    private readonly db: TypeOrmHealthIndicator,
    private readonly customHealthIndicator: CustomHealthIndicator,
  ) {
    this.redisClient = new Redis({
      host: process.env.REDIS_HOST || 'localhost',
      port: parseInt(process.env.REDIS_PORT ?? '6379', 10),
    });
  }

  // Liveness check
  @Get('liveness')
  @HealthCheck()
  async liveness() {
    return this.health.check([]);
  }

  // Readiness check
  @Get('readiness')
  @HealthCheck()
  async readiness() {
    return this.health.check(this.allIndicators());
  }

  // Full health check
  @Get()
  @HealthCheck()
  async check() {
    return this.health.check(this.allIndicators());
  }

  // The readiness check and the full check share the same list of indicators.
  private allIndicators() {
    return [
      () => this.db.pingCheck('database'),
      () => this.http.pingCheck('nestjs-docs', 'https://docs.nestjs.com'),
      () => this.checkRedis('redis'),
      () => this.customHealthIndicator.isHealthy('custom-service', { threshold: 0.5 }),
    ];
  }

  // Ping Redis; throw on failure so that the overall result is marked as failed
  // instead of returning a "down" entry inside an otherwise successful response.
  private async checkRedis(key: string): Promise<HealthIndicatorResult> {
    try {
      await this.redisClient.ping();
      return { [key]: { status: 'up' } };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      throw new HealthCheckError('Redis check failed', {
        [key]: { status: 'down', message },
      });
    }
  }
}
- Create the custom health indicator:
// src/health/custom.health.ts
import { Injectable } from '@nestjs/common';
import { HealthIndicator, HealthIndicatorResult, HealthCheckError } from '@nestjs/terminus';

@Injectable()
export class CustomHealthIndicator extends HealthIndicator {
  async isHealthy(key: string, options: { threshold: number }): Promise<HealthIndicatorResult> {
    // Simulated check logic: replace the random draw with a real check.
    const healthStatus = Math.random() > options.threshold;
    const result = this.getStatus(key, healthStatus, {
      message: healthStatus ? 'Service is healthy' : 'Service is unhealthy',
      timestamp: new Date().toISOString(),
    });
    if (!healthStatus) {
      throw new HealthCheckError('Custom health check failed', result);
    }
    return result;
  }
}
- Create the metrics module:
// src/metrics/metrics.module.ts
import { Module } from '@nestjs/common';
import { MetricsController } from './metrics.controller';
import { MetricsService } from './metrics.service';

@Module({
  controllers: [MetricsController],
  providers: [MetricsService],
  exports: [MetricsService],
})
export class MetricsModule {}
- Create the metrics service:
// src/metrics/metrics.service.ts
import { Injectable } from '@nestjs/common';
import { register, Counter, Gauge, Histogram } from 'prom-client';

@Injectable()
export class MetricsService {
  private readonly httpRequestsTotal: Counter<string>;
  private readonly httpRequestDurationSeconds: Histogram<string>;
  private readonly appHealth: Gauge<string>;
  private readonly dbHealth: Gauge<string>;
  private readonly redisHealth: Gauge<string>;

  constructor() {
    // Reset the global registry so that re-creating this service (for example in tests)
    // does not fail on duplicate metric names. This also drops metrics registered elsewhere.
    register.clear();
    // Set default labels
    register.setDefaultLabels({
      app: 'nestjs-application',
    });
    // Create the metrics
    this.httpRequestsTotal = new Counter({
      name: 'http_requests_total',
      help: 'Total number of HTTP requests',
      labelNames: ['method', 'route', 'status'],
    });
    this.httpRequestDurationSeconds = new Histogram({
      name: 'http_request_duration_seconds',
      help: 'HTTP request duration in seconds',
      labelNames: ['method', 'route', 'status'],
      buckets: [0.1, 0.5, 1, 2, 5],
    });
    this.appHealth = new Gauge({
      name: 'app_health',
      help: 'Application health status',
      labelNames: ['service'],
    });
    this.dbHealth = new Gauge({
      name: 'db_health',
      help: 'Database health status',
      labelNames: ['database'],
    });
    this.redisHealth = new Gauge({
      name: 'redis_health',
      help: 'Redis health status',
      labelNames: ['service'],
    });
  }

  // Record an HTTP request (duration is in seconds)
  recordHttpRequest(method: string, route: string, status: number, duration: number) {
    this.httpRequestsTotal.labels(method, route, status.toString()).inc();
    this.httpRequestDurationSeconds.labels(method, route, status.toString()).observe(duration);
  }

  // Set the application health status
  setAppHealth(service: string, status: number) {
    this.appHealth.labels(service).set(status);
  }

  // Set the database health status
  setDbHealth(database: string, status: number) {
    this.dbHealth.labels(database).set(status);
  }

  // Set the Redis health status
  setRedisHealth(service: string, status: number) {
    this.redisHealth.labels(service).set(status);
  }

  // Return all metrics in the Prometheus text format
  async getMetrics() {
    return register.metrics();
  }
}
- Create the metrics controller:
// src/metrics/metrics.controller.ts
import { Controller, Get, Res } from '@nestjs/common';
import { Response } from 'express';
import { register } from 'prom-client';
import { MetricsService } from './metrics.service';

@Controller('metrics')
export class MetricsController {
  constructor(private readonly metricsService: MetricsService) {}

  @Get()
  async getMetrics(@Res() res: Response) {
    const metrics = await this.metricsService.getMetrics();
    // Use the content type that prom-client advertises for the text exposition format.
    res.set('Content-Type', register.contentType);
    res.send(metrics);
  }
}
- Import both modules in the application module:
// src/app.module.ts
import { Module } from '@nestjs/common';
import { HealthModule } from './health/health.module';
import { MetricsModule } from './metrics/metrics.module';
import { AppController } from './app.controller';
import { AppService } from './app.service';

@Module({
  imports: [
    HealthModule,
    MetricsModule,
  ],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}
- Create an HTTP interceptor that records metrics:
// src/common/interceptors/metrics.interceptor.ts
import { Injectable, NestInterceptor, ExecutionContext, CallHandler } from '@nestjs/common';
import { Observable } from 'rxjs';
import { tap } from 'rxjs/operators';
import { MetricsService } from '../../metrics/metrics.service';

@Injectable()
export class MetricsInterceptor implements NestInterceptor {
  constructor(private readonly metricsService: MetricsService) {}

  intercept(context: ExecutionContext, next: CallHandler): Observable<any> {
    const now = Date.now();
    const request = context.switchToHttp().getRequest();
    const response = context.switchToHttp().getResponse();
    const method = request.method;
    // Prefer the route pattern over the raw URL to keep label cardinality low.
    const route = request.route ? request.route.path : request.url;
    return next.handle().pipe(
      // Note: this callback runs only when the handler succeeds,
      // so requests that end in an exception are not recorded here.
      tap(() => {
        const duration = (Date.now() - now) / 1000; // milliseconds to seconds
        const status = response.statusCode;
        this.metricsService.recordHttpRequest(method, route, status, duration);
      }),
    );
  }
}
- Register the interceptor in the main file:
// src/main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
import { MetricsInterceptor } from './common/interceptors/metrics.interceptor';
import { MetricsService } from './metrics/metrics.service';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Get the metrics service from the application context
  const metricsService = app.get(MetricsService);
  // Apply the metrics interceptor globally
  app.useGlobalInterceptors(new MetricsInterceptor(metricsService));
  await app.listen(3000);
}
bootstrap();
Common problems and solutions
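1. The health check fails
Possible causes:
- A dependency is unavailable (database, Redis, and so on)
- The health check is misconfigured
- Network problems
Solutions:
- Check the state of the dependencies
- Verify the health check configuration
- Verify network connectivity
2. The health check responds slowly
Possible causes:
- Dependencies respond slowly
- The health check logic is too complex
- Too many concurrent requests
Solutions:
- Improve the performance of the dependencies
- Simplify the health check logic
- Increase the health check timeout where the extra latency is expected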
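A slow dependency should not be able to stall the whole health endpoint, so it helps to give each check an explicit time budget. Several built-in indicators accept a timeout option of their own (see the Terminus documentation for the exact option per indicator). For custom checks, a wrapper along the following lines is one possibility. The helper withTimeout is a hypothetical name, and the sketch only stops waiting: the underlying operation keeps running in the background.

```typescript
// Hypothetical helper: reject if `check` does not settle within `ms` milliseconds.
function withTimeout<T>(check: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared in both outcomes
  // so that it cannot keep the process alive or fire later.
  return Promise.race([check, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    },
  );
}
```

Inside a custom indicator this could wrap the real call, for example withTimeout(client.ping(), 500, 'redis'), where client stands for whatever client the check uses, with the rejection translated into a failed indicator result.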
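3. The monitoring system cannot fetch metrics
Possible causes:
- The metrics endpoint is misconfigured
- Prometheus is misconfigured
- Network access is restricted
Solutions:
- Check that the metrics endpoint is reachable
- Verify the Prometheus configuration
- Check the network access rules
4. The Kubernetes readiness check fails
Possible causes:
- A dependency is not ready
- The application takes too long to initialize
- The health check is misconfigured
Solutions:
- Make sure the dependencies are ready
- Increase the initial delay of the readiness check
- Verify the health check configuration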
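When slow initialization is the cause, a complementary approach to raising the initial delay is to let the readiness check report "down" until the startup work has finished. The sketch below tracks pending startup tasks by name. ReadinessGate and the task names are hypothetical and only illustrate the idea; the returned object follows the key-to-status layout used by the indicators in this article.

```typescript
// Hypothetical helper that reports "down" until all startup tasks are done.
type GateStatus = { status: 'up' } | { status: 'down'; pending: string[] };

class ReadinessGate {
  private readonly pending: Set<string>;

  constructor(tasks: string[]) {
    this.pending = new Set<string>(tasks);
  }

  // Call this when a startup task (migrations, cache warm-up, ...) completes.
  markDone(task: string): void {
    this.pending.delete(task);
  }

  isReady(): boolean {
    return this.pending.size === 0;
  }

  // Produce an indicator-style result keyed by `key`.
  check(key: string): Record<string, GateStatus> {
    const status: GateStatus = this.isReady()
      ? { status: 'up' }
      : { status: 'down', pending: Array.from(this.pending) };
    return { [key]: status };
  }
}
```

A readiness endpoint could include gate.check('startup') among its indicators, so the Pod starts receiving traffic as soon as initialization completes rather than after a fixed delay.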
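Best practices
- Layered health checks: implement checks at different levels, such as a basic check and a detailed check
- Dependency checks: check the state of every critical dependency
- Reasonable timeouts: give each health check a sensible timeout
- Monitoring integration: connect health checks to the monitoring system
- Kubernetes support: implement readiness and liveness checks that meet Kubernetes requirements
- Error handling: add appropriate error handling to health checks
- Logging: log health check results in enough detail to diagnose failures
- Performance: make sure health checks do not degrade application performance
Code optimization suggestions
- Use configuration management: move health check settings into configuration files
- Implement caching: cache health check results to avoid repeated checks
- Use asynchronous checks: keep health checks asynchronous to improve concurrency
- Add metric labels: add more labels to health metrics to improve observability
- Implement alerting: trigger an alert notification when a health check fails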
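As an illustration of the caching suggestion, the sketch below keeps the last result for a fixed time-to-live so that frequent probe calls do not hit the dependency every time. CachedCheck is a hypothetical helper, and the clock is injectable only to make the behavior easy to verify. If run returns a promise, the promise itself is cached, which also means a failure stays cached until the entry expires.

```typescript
// Hypothetical helper: reuse a check result until it is older than ttlMs.
class CachedCheck<T> {
  private readonly run: () => T;
  private readonly ttlMs: number;
  private readonly now: () => number;
  private cached?: { value: T; expiresAt: number };

  constructor(run: () => T, ttlMs: number, now: () => number = Date.now) {
    this.run = run;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): T {
    const current = this.now();
    if (this.cached !== undefined && current < this.cached.expiresAt) {
      return this.cached.value; // still fresh: skip the real check
    }
    const value = this.run();
    this.cached = { value, expiresAt: current + this.ttlMs };
    return value;
  }
}
```

Keep the time-to-live shorter than the probe period if every probe is expected to see fresh data. A longer value trades freshness for less load on the dependencies.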
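Summary
The NestJS health check module offers a concise and efficient way to monitor the state of an application. After working through this article, you should know:
- How to install and configure the health check module
- How to use the built-in health indicators
- How to create custom health indicators
- How to integrate health checks with a monitoring system
- How to implement Kubernetes readiness and liveness checks
- Best practices for health checks and solutions to common problems
Health checks are an important part of modern applications: they help us detect and resolve problems early and improve reliability and availability. Used sensibly, the NestJS health check features make an application more robust and easier to maintain.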
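Quiz
Which of the following is the correct command to install the NestJS health check module?
A. npm install --save @nestjs/health
B. npm install --save @nestjs/terminus
C. npm install --save health-check
D. npm install --save terminus
How do you implement a database health check in NestJS?
How do you create a custom health indicator?
What are Kubernetes readiness and liveness checks, and how do they differ?
How do you integrate health checks with the Prometheus monitoring system?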