Jenkins Pipeline Syntax: Error Handling in Detail (with a practical recipe at the end)

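When writing a pipeline script, it pays to plan the error handling of every step. When the pipeline breaks, those error messages are what let us quickly locate the real cause of the failure.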

Jenkins offers several forms of error handling, so I list them here as a reference for your own pipelines.

I first go through a few keywords and what each one does, and then describe the approach I use most often in practice.

# The error keyword

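The error() step explicitly throws an error in the pipeline, with a message you define. Execution stops at the point where it is called and, unless the error is caught, the whole build is marked as failed. Use case: call it wherever a check does not meet expectations.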

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Get the current timestamp
def getTimeStamp(){
  LocalDateTime.now().format(DateTimeFormatter.ofPattern('yyyyMMddHHmmss')) // e.g. 20210817154700
}

// Fail at random
def randomError(stageName){
  def timeStamp = getTimeStamp()
  def lastDigit = timeStamp[-1] as int
  sleep(new Random().nextInt(5)) // wait a random 0-4 seconds
  // Odd last digit: print the timestamp; even last digit: raise an error
  if (lastDigit % 2 != 0) {
    println(timeStamp)
  } else {
    error("This is the ${stageName} stage error message")
  }
}

pipeline {
  agent any
  stages {
	  stage("one") {
		  steps {
			  script{
				  echo "===one==="
				  randomError("one")
			  }
		  }
	  }
	  stage("two") {
		  steps {
			  script{
				  echo "===two==="
				  randomError("two")
			  }
		  }
	  }
	  stage("three") {
		  steps {
			  script{
				  echo "===three==="
				  randomError("three")
			  }
		  }
	  }
  }
}

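In this example, the helpers at the top read the current timestamp and use its last digit to decide whether to fail. The three stages below each call the helper, so a failure shows up at a random stage, at which point the build exits and the error is thrown.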
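For the validation use case, a minimal sketch might look like the following. The DEPLOY_ENV parameter and the list of allowed values are assumptions made up for illustration; they are not part of the example above.

```groovy
pipeline {
    agent any
    parameters {
        // Hypothetical parameter, used only for illustration
        string(name: 'DEPLOY_ENV', defaultValue: '', description: 'Target environment')
    }
    stages {
        stage('validate') {
            steps {
                script {
                    // Assumed set of valid values
                    def allowed = ['dev', 'test', 'prod']
                    if (!(params.DEPLOY_ENV in allowed)) {
                        // Stops here; the build fails with this message
                        error("Unsupported DEPLOY_ENV '${params.DEPLOY_ENV}', expected one of ${allowed}")
                    }
                    echo "Deploying to ${params.DEPLOY_ENV}"
                }
            }
        }
    }
}
```

Failing early with a clear message like this tends to be cheaper than letting a bad value surface as an obscure error several stages later.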

# The catchError keyword

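catchError is a step that catches errors raised inside its block and lets the pipeline continue. Use case: steps whose failure should not stop the rest of the pipeline from running.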

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Get the current timestamp
def getTimeStamp(){
  LocalDateTime.now().format(DateTimeFormatter.ofPattern('yyyyMMddHHmmss')) // e.g. 20210817154700
}

// Fail at random
def randomError(stageName){
  def timeStamp = getTimeStamp()
  def lastDigit = timeStamp[-1] as int
  println("Current time: " + timeStamp)
  sleep(new Random().nextInt(5)) // wait a random 0-4 seconds
  // Odd last digit: print the timestamp; even last digit: fail the shell step
  if (lastDigit % 2 != 0) {
    println(timeStamp)
  } else {
    sh "exit 1"
  }
}

pipeline {
  agent any
  stages {
	  stage("one") {
		  steps {
			  script{
				  echo "===one==="
				  catchError(buildResult: 'SUCCESS', stageResult: 'FAILURE'){
					  script {
						  randomError("one")
					  }
				  }
				  echo 'this is after catchError one block'
			  }
		  }
	  }
	  stage("two") {
		  steps {
			  script{
				  echo "===two==="
				  catchError(buildResult: 'SUCCESS', stageResult: 'FAILURE'){
					  script {
						  randomError("two")
					  }
				  }
				  echo 'this is after catchError two block'
			  }
		  }
	  }
	  stage("three") {
		  steps {
			  script{
				  echo "===three==="
				  catchError(buildResult: 'SUCCESS', stageResult: 'FAILURE'){
					  script {
						  randomError("three")
					  }
				  }
				  echo 'this is after catchError three block'
			  }
		  }
	  }
  }
}

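In the example above, even when the exit 1 branch is hit, the pipeline keeps running. The current stage ends up marked as failed, but the overall build result is still success.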
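catchError also accepts a message parameter, and buildResult does not have to be SUCCESS. The sketch below assumes a hypothetical non-blocking lint stage whose failure should downgrade the build to UNSTABLE instead of hiding it completely; the stage names and the message text are made up.

```groovy
pipeline {
    agent any
    stages {
        stage('lint') {
            steps {
                // Hypothetical non-blocking check: mark the stage as failed,
                // mark the build as UNSTABLE, and keep going
                catchError(buildResult: 'UNSTABLE', stageResult: 'FAILURE', message: 'lint failed, continuing') {
                    sh 'exit 1'
                }
            }
        }
        stage('build') {
            steps {
                echo 'this stage still runs'
            }
        }
    }
}
```

Keeping the build result at UNSTABLE rather than SUCCESS is a design choice: the pipeline still runs to the end, but the build list shows that something needs attention.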

# try-catch

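try-catch is the error handling mechanism of the Groovy language itself and can be used inside script blocks. It lets you catch and handle exceptions. Use case: catching errors for a whole block at once, instead of checking after every step whether it succeeded and calling error each time.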

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Get the current timestamp
def getTimeStamp(){
  LocalDateTime.now().format(DateTimeFormatter.ofPattern('yyyyMMddHHmmss')) // e.g. 20210817154700
}

// Fail at random
def randomError(stageName){
  def timeStamp = getTimeStamp()
  def lastDigit = timeStamp[-1] as int
  println("Current time: " + timeStamp)
  sleep(new Random().nextInt(5)) // wait a random 0-4 seconds
  // Odd last digit: print the timestamp; even last digit: fail the shell step
  if (lastDigit % 2 != 0) {
    println(timeStamp)
  } else {
    sh "exit 1"
  }
}

pipeline {
  agent any
  stages {
	  stage("one") {
		  steps {
			  script{
				  echo "===one==="
				  try {
					  randomError("one")
				  } catch (Exception e) {
					  echo "Caught an exception: ${e}"
				  }
				  echo 'this is after catch one block'
			  }
		  }
	  }
	  stage("two") {
		  steps {
			  script{
				  echo "===two==="
				  try {
					  randomError("two")
				  } catch (Exception e) {
					  echo "Caught an exception: ${e}"
				  }
				  echo 'this is after catch two block'
			  }
		  }
	  }
	  stage("three") {
		  steps {
			  script{
				  echo "===three==="
				  try {
					  randomError("three")
				  } catch (Exception e) {
					  echo "Caught an exception: ${e}"
				  }
				  echo 'this is after catch three block'
			  }
		  }
	  }
  }
}

In the example above, the error raised by sh "exit 1" is caught and the exception is printed, but the pipeline keeps running.

If you want the whole build to exit when an exception occurs, you can use:

try {
  sh "exit 1"
} catch (Exception e) {
  throw(e)
}

throw(e) rethrows the exception to the caller, which aborts the whole build.

If there are steps you want to run whether or not an exception was caught, add a finally block:

try {
  sh "exit 1"
} catch (Exception e) {
  throw(e)
} finally {
  echo "finally step"
}

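One thing to note: what try-catch catches is usually the raw exception from the failing code, with its class name and stack information. That output is long and distracting, so I generally don't recommend returning it as-is. Return a custom error message instead.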
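A minimal sketch of that advice, assuming a hypothetical package stage: log the short message from the exception for reference, then fail with wording of your own. The stage name and both message texts are made up for illustration.

```groovy
pipeline {
    agent any
    stages {
        stage('package') {
            steps {
                script {
                    try {
                        sh 'exit 1'
                    } catch (Exception e) {
                        // getMessage() is normally a single line, much shorter than a stack trace
                        echo "package step failed: ${e.getMessage()}"
                        // Hypothetical custom message aimed at whoever reads the build result
                        error('Packaging failed, check the build command output above')
                    }
                }
            }
        }
    }
}
```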

# The unstable keyword

The unstable() step marks the build as UNSTABLE, meaning there are warnings or minor problems but the build has not failed. In most cases you will rarely need this keyword.

pipeline {
    agent any
    stages {
        stage('Example') {
            steps {
                script {
                    // Placeholder condition; replace it with a real check
                    def someMinorIssue = true
                    if (someMinorIssue) {
                        unstable("This is a warning message")
                    }
                }
            }
        }
    }
}
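One way the minor-issue condition could be computed is from the exit code of a command. The sketch below assumes a hypothetical check command (stood in for by exit 3) and relies on the returnStatus option of sh, which returns the exit code instead of failing the step.

```groovy
pipeline {
    agent any
    stages {
        stage('check') {
            steps {
                script {
                    // 'exit 3' stands in for a real check command (an assumption for this sketch)
                    def status = sh(script: 'exit 3', returnStatus: true)
                    if (status != 0) {
                        // Non-zero exit code: flag the build, but do not fail it
                        unstable("check exited with code ${status}")
                    }
                }
            }
        }
    }
}
```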

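Notice

This is an original article by eryajf. Reproduction without authorization is strictly prohibited, and infringement will be pursued. This notice is a watermark placed at a random position in the article; thank you for your understanding.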

# My approach in practice

In real use, we usually want any command or step that fails to raise an error immediately and stop the pipeline.

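I recommend combining try-catch with error. This catches exceptions for the whole block, throws a custom error message, and stops the pipeline from going any further, which covers what we need in practice.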

pipeline {
    agent any
    stages {
        stage("one") {
            steps {
                script{
                    echo "===one==="
                    try {
                        echo "aaa"
                        sh "exit 1"
                        echo "bbb"
                    } catch (Exception e) {
                        env.FAILURE_MSG = "error message one"
                        error(FAILURE_MSG)
                    }
                }
            }
        }
        stage("two") {
            steps {
                script{
                    echo "===two==="
                    try {
                        echo "bbb"
                    } catch (Exception e) {
                        env.FAILURE_MSG = "error message two"
                        error(FAILURE_MSG)
                    }
                }
            }
        }
    }
    post {
        always {
            sh 'printenv'
        }
        success {
            echo "this is success"
        }
        failure {
            script{
                currentBuild.description = env.FAILURE_MSG // Put the error message recorded during the build into the build description
            }
        }
        cleanup {
            cleanWs()
        }
    }
}

Key points

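The "try-catch block" mentioned earlier simply means the pipeline code wrapped by that construct. So in my view, everything under each stage should be wrapped in try-catch, so that a failure on any line is caught automatically. (I have seen quite a few people write pipeline code that checks the return value of almost every line, which is clearly unnecessary.)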
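If repeating the same try-catch in every stage feels noisy, the pattern can be folded into a small helper. This is only a sketch: guarded is a hypothetical function name of my own, not something Jenkins provides, and it assumes the FAILURE_MSG convention described in this article.

```groovy
// Hypothetical helper, not part of Jenkins: runs a block and turns
// any failure inside it into a custom error message
def guarded(String failureMessage, Closure body) {
    try {
        body()
    } catch (Exception e) {
        // Record the message so a post block can read it later
        env.FAILURE_MSG = failureMessage
        error(failureMessage)
    }
}

pipeline {
    agent any
    stages {
        stage('one') {
            steps {
                script {
                    guarded('error message one') {
                        echo 'aaa'
                        sh 'exit 1'
                    }
                }
            }
        }
    }
}
```

Each stage then only states its own failure message, and the catch logic lives in one place.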

Three things in the final post section of the code above are also worth mentioning:

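First: always prints the environment variables of the current run. I covered this long ago, so I won't repeat it here.
Second: whenever a step fails, its custom error message is assigned to the FAILURE_MSG variable. On failure, that variable is then assigned to currentBuild.description, so the error shows up in the description of that build. If your Jenkins is wrapped by another platform, that platform can easily read the failure reason from this fixed field. This is a small detail that I don't share with just anyone.
Third: in the post section, the commonly used always, success, and failure conditions run in a fixed order, and you cannot change when they run by changing where you place them in the block. For cleaning the workspace, use the cleanup condition, which runs after all the other post conditions. I used to be firmly against cleaning the workspace, but two practical changes made me reconsider. My reason for not cleaning was to preserve the scene of the build: when a pipeline fails, we may want to reproduce, locate, and debug the problem in the workspace as it was at the moment of failure. On the one hand, though, the pipeline's error handling is now thorough enough that finding the cause of a failed build has become simple. On the other hand, a failed build would often leave dirty git data in the workspace, after which every triggered build failed while fetching the pipeline script, and the only fix was for ops to clear the workspace by hand. That trade-off is not worth it, so I now firmly believe that every pipeline should configure cleanup as its last step.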
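The ordering point can be checked with a small experiment. In the sketch below the post conditions are deliberately written in an unusual order. The Jenkins documentation states that cleanup runs after every other post condition has been evaluated, so the cleanup message should still appear last in the log. The stage name and echo texts are made up.

```groovy
pipeline {
    agent any
    stages {
        stage('demo') {
            steps {
                echo 'hello'
            }
        }
    }
    post {
        // Written first on purpose; it should still run last
        cleanup {
            echo 'cleanup'
        }
        success {
            echo 'success'
        }
        always {
            echo 'always'
        }
    }
}
```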

Last updated: 2025/03/06, 21:25:58
