Please enable better error handling or auto handling for the case of normal message callback followed by "RST_STREAM with code 0" error status callback #2004
Comments
Since it works with another client, you may be encountering a bug in grpc-js. Would you mind running your client with the environment variables
I seem to have a similar issue; here are the requested logs: As far as I can tell, this seems to occur when the server (in my case Tonic) tries to short-circuit and return the full response in the initial response headers frame, and does not put the gRPC status details in a trailer frame.
Apologies for the possible red herring. After further investigation it appears that my situation was caused by a faulty middleware not properly propagating end of stream hints. After fixing, grpc-js seems to work fine in the trailers-only situation. Logs from the server-side (with faulty middleware):
With proper middleware:
I think my team has come across a similar issue, "Error: 13 INTERNAL: Received RST_STREAM with code 0", which is confounding us.
Could this possibly be related to #2316?
That error occurs when a client receives an RST_STREAM that ends the stream without the server having sent trailers. That is a protocol violation, which is why it causes an error to be surfaced. I don't see any way that the change you linked would impact the handling of the relevant events that result in that error. As I asked the previous person, can you run your client with the environment variables
Ok. The reason I thought they may be related is that we are also intermittently seeing dropped data (albeit we don't manage the resource server, so we can't say with 100% certainty that it isn't the resource that has the problem). Given what you said, the dropped data is probably unrelated to the RST_STREAM error, which I think we can just retry on. Anyway, I'll run with that env tomorrow once my team is back online.
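As an aside on the "just retry on it" idea: because the grpc-js unary API is callback-based, one way to do this is a small application-level retry wrapper. The following is a minimal sketch, assuming the method is safe to retry (idempotent); `callWithRetry`, `isRstStreamInternal`, and the bound client method in the usage comment are illustrative names, not part of the code in this thread.

```ts
import { status, type ServiceError } from '@grpc/grpc-js'

// True for the "13 INTERNAL: Received RST_STREAM with code 0" shape discussed here.
function isRstStreamInternal(err: ServiceError): boolean {
  return err.code === status.INTERNAL && /RST_STREAM/.test(err.details ?? '')
}

// Wrap a callback-style unary method in a promise and retry it a bounded
// number of times when that specific error comes back.
function callWithRetry<Req, Res>(
  call: (req: Req, cb: (err: ServiceError | null, res?: Res) => void) => void,
  req: Req,
  attemptsLeft = 2
): Promise<Res> {
  return new Promise((resolve, reject) => {
    call(req, (err, res) => {
      if (err && isRstStreamInternal(err) && attemptsLeft > 0) {
        // Retry only this protocol error; everything else surfaces immediately.
        resolve(callWithRetry(call, req, attemptsLeft - 1))
      } else if (err) {
        reject(err)
      } else {
        resolve(res as Res)
      }
    })
  })
}

// Usage sketch: callWithRetry(client.getThing.bind(client), request)
```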
How long have you been seeing this intermittently dropped data? I only published that change 4 hours ago, and it's supposed to be a fix for that problem.
Ok, so we were using 1.8.0, and the dropped data was definitely fixed by bumping to 1.8.2 or rolling back to 1.7.3. We started experiencing the problem after an upgrade to 1.8.0 sometime last month.
@murgatroid99 Tested on: the server runs on Node 16.18.0 and uses grpc-js 1.8.11. Server error log: (I am not seeing any gRPC error here)
Server code (NestJS):
```ts
// Imports ...
async function bootstrap() {
const port = process.env.PORT || '5000'
const address = '0.0.0.0'
const journalingProtoPath = join(
process.cwd(),
'/node_modules/path/to/protofile.proto'
)
const microserviceOptions: MicroserviceOptions = {
transport: Transport.GRPC,
options: {
package: [JOURNALING_PACKAGE_NAME],
protoPath: [journalingProtoPath],
url: `${address}:${port}`,
},
}
const app = await NestFactory.createMicroservice(AppModule, {
...microserviceOptions,
})
app.listen().then(() => {
instance.info(
`Started GRPC server on ${microserviceOptions.options.url}`
)
})
}
bootstrap()
```
exceptionsFilter.ts:
```ts
import { throwError } from 'rxjs'
import { status } from '@grpc/grpc-js'
import { BaseRpcExceptionFilter, RpcException } from '@nestjs/microservices'
import {
Catch,
Inject,
Logger,
ArgumentsHost,
LoggerService,
RpcExceptionFilter,
} from '@nestjs/common'
@Catch()
export class AllExceptionsFilter
extends BaseRpcExceptionFilter
implements RpcExceptionFilter
{
constructor(@Inject(Logger) private readonly logger: LoggerService) {
super()
}
catch(exception: any, host: ArgumentsHost) {
const [req] = host.getArgs()
const code =
exception.code !== undefined ? exception.code : status.INTERNAL
const message = exception.details
? String(exception.details)
: exception.message
const stack = exception.stack
const userId = req?.user?.id ?? 'NONE-JOURNALING'
this.logger.error({
userId,
message,
stack,
})
return throwError(
new RpcException({
code,
stack,
message,
}).getError()
)
}
}
```
Client code:
```ts
// Imports ...
type CredentialsReturnType = {
callCreds: grpc.CallCredentials
channelCredentials: grpc.ChannelCredentials
}
export class GrpcClient {
private static email_notification: EmailNotificationServiceClient
private static image_generator: ImageGeneratorServiceClient
private static analytics: AnalyticsServiceClient
private static question: QuestionServiceClient
private static goal: GoalServiceClient
private static questionnaire: QuestionnaireServiceClient
private static journaling: JournalingServiceClient
private static blog: BlogServiceClient
private static wellbeing_techniques: WellbeingTechniquesServiceClient
private static chat: ChatServiceClient
private static follow_on_question: FollowOnQuestionServiceClient
async initialize() {
const packageNames = Object.values(ProtoPackage)
for await (const packageName of packageNames) {
const serviceUrl = config.services.servicesUrl[packageName]
const serviceName = serviceNameMapping[packageName]
const protoPath = path.join(
process.cwd(),
config.services.protoPaths[packageName]
)
const PROTO_PATH = path.resolve(__dirname, protoPath)
const { channelCredentials, callCreds } =
await GrpcClient.getCredentials(serviceUrl)
var packageDefinition = protoLoader.loadSync(PROTO_PATH, {
keepCase: false,
longs: Number,
enums: Number,
defaults: true,
oneofs: true
})
const packageDef = grpc.loadPackageDefinition(packageDefinition),
routeguide = packageDef[packageName],
service = get(routeguide, serviceName),
client = new service(
serviceUrl,
// grpc.credentials.createInsecure(), // Uncomment to work on local
grpc.credentials.combineChannelCredentials(
channelCredentials,
callCreds
), // Comment to work on local
{
"grpc.service_config": JSON.stringify({
methodConfig: [
{
name: [
{
service: `${packageName}.${serviceName}`
}
],
retryPolicy: {
maxAttempts: 2,
initialBackoff: "1s",
maxBackoff: "4s",
backoffMultiplier: 2,
retryableStatusCodes: ["UNAVAILABLE"]
}
}
]
})
} as grpc.ChannelOptions
)
GrpcClient[packageName] = client
logger.info(`Created GrpcClient for ${packageName}`)
}
}
private static async getCredentials(
serviceUrl: string
): Promise<CredentialsReturnType> {
logger.info("get credentials for grpcClient")
const target = `https://${serviceUrl}`
const idTokenClient = await new GoogleAuth().getIdTokenClient(target)
const channelCredentials = grpc.credentials.createSsl()
const callCreds =
grpc.credentials.createFromGoogleCredential(idTokenClient)
return {
channelCredentials,
callCreds
}
}
public static async request(
method: string,
payload: any,
packageName: ProtoPackage
): Promise<any> {
try {
if (!GrpcClient[packageName]) {
throw new Error(`GrpcClient for ${packageName} not initialized`)
}
return new Promise((resolve, reject) =>
get(GrpcClient, packageName)[method](
{ ...payload },
(err: any, response: any) => {
          if (err) return reject(err)
          resolve(response)
}
)
)
} catch (err) {
throw err
}
}
}
```
Client logs: RST_STREAM with code 0
What I want to see all the time:
I noticed that when I get "Received RST_STREAM with code 0" there are no "Received server trailers" entries in the logs.
@i7N3 Have you checked whether this happens when running the client and server on the same machine? If it does, a complete reproduction would really help me track down the cause of the bug. If it does not happen, then the most likely cause is some intervening proxy, so I would recommend opening a support ticket for whatever system you are running the server on, with a reference to your comment.
@murgatroid99 This does not happen when I run the client and server on the same machine. I will open a support ticket and report back on what comes of it.
BTW I am using GCP - Cloud Run. Maybe someone has experience with this error on the same stack. |
Facing the same issue. Is there any update on this? I am using Node.js 16.18.0 and Electron 24.3.0, and I am running both the gRPC server and client locally on the same machine.
From me, unfortunately not. Can you provide a minimal reproduction repo?
Experiencing the same issue with |
Enabling |
I can confirm that enabling |
Is your feature request related to a problem? Please describe.
I wrote a simple unary RPC client against an internal RPC server that I don't control. It failed with "Error: 13 INTERNAL: Received RST_STREAM with code 0". I have identified the root cause. The server sent the expected message, which was received in the onReceiveMessage callback of grpc-js's client.ts. However, the grpc-js client then received the unexpected error "RST_STREAM with code 0" instead of an OK status. It appears the server, or one of the layers in front of it, unexpectedly closed the stream after the correct response message was sent to the grpc-js client. The responseMessage variable still holds the earlier received message by the time the onReceiveStatus callback is called with the error status in the grpc-js layer.
It appears that, from my client code, there is no way to access the received message when the follow-up status callback has received a status other than status.OK. (See the current implementation, v1.4.5, of "onReceiveStatus" in makeUnaryRequest and makeClientStreamRequest in @grpc/grpc-js/src/client.ts.)
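To make that concrete, here is a sketch of the current caller-facing behavior, assuming an already-constructed grpc-js client stub; `getReport` and the request shape are hypothetical names for the example.

```ts
// Sketch: with the current API, when the call ends with the non-OK status,
// the callback receives only the error and `response` is undefined, even
// though the server's reply had already been received inside grpc-js.
client.getReport({ id: '123' }, (err, response) => {
  if (err) {
    // err.code === 13 (INTERNAL), err.details === 'Received RST_STREAM with code 0'
    // `response` is not accessible here.
    return
  }
  console.log(response)
})
```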
Describe the solution you'd like
When an error happens, could the callback receive not just the error but also the response message?
FROM:
TO:
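As an illustrative sketch of the FROM/TO change being requested (this is not the actual grpc-js internals; `errorFromStatus`, `StatusLike`, and the surrounding function names are made up for the example), the idea is that the onReceiveStatus handling would pass the already-received message to the surface callback even when the status is not OK.

```ts
import { status as Status, type Metadata, type ServiceError } from '@grpc/grpc-js'

// Local stand-in for the status object shape used inside grpc-js.
interface StatusLike {
  code: Status
  details: string
  metadata: Metadata
}

type UnaryCallback<Res> = (err: ServiceError | null, value?: Res) => void

// Made-up helper for the sketch: build a ServiceError from a status object.
function errorFromStatus(statusObject: StatusLike): ServiceError {
  const err = new Error(
    `${statusObject.code} ${Status[statusObject.code]}: ${statusObject.details}`
  ) as ServiceError
  err.code = statusObject.code
  err.details = statusObject.details
  err.metadata = statusObject.metadata
  return err
}

// FROM (sketch of today's behavior): the message captured by onReceiveMessage
// is dropped whenever the final status is not OK.
function onReceiveStatusToday<Res>(
  statusObject: StatusLike,
  responseMessage: Res | undefined,
  callback: UnaryCallback<Res>
) {
  if (statusObject.code === Status.OK) {
    callback(null, responseMessage)
  } else {
    callback(errorFromStatus(statusObject))
  }
}

// TO (sketch of the request): hand the already-received message to the caller
// alongside the error, so the caller can decide whether to use it.
function onReceiveStatusRequested<Res>(
  statusObject: StatusLike,
  responseMessage: Res | undefined,
  callback: UnaryCallback<Res>
) {
  if (statusObject.code === Status.OK) {
    callback(null, responseMessage)
  } else {
    callback(errorFromStatus(statusObject), responseMessage)
  }
}
```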
This would enable the user to write better error handling:
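A sketch of the caller-side handling this would enable (again illustrative: `client`, `getReport`, `handleReport`, and `handleFailure` are hypothetical, and an error callback that also carries the response is the requested behavior, not the current grpc-js API):

```ts
client.getReport({ id: '123' }, (err, response) => {
  if (err && response !== undefined) {
    // The reply arrived before the spurious RST_STREAM: log the protocol error
    // but keep the data instead of failing the whole operation.
    console.warn(`ignoring ${err.code} ${err.details}; a response was received`)
    handleReport(response)
    return
  }
  if (err) {
    handleFailure(err)
    return
  }
  handleReport(response)
})
```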
Describe alternatives you've considered
For the same server, I wrote a Scala client with sbt-akka-grpc. It worked out of the box, and I did not have to deal with the case of a correct message followed by the "RST_STREAM with code 0" error.
Hence another option would be for grpc-js to treat this special case as a normal OK case and deliver the message to the caller instead of surfacing the error.
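Until something like that exists, one possible workaround is a client interceptor: grpc-js interceptors see onReceiveMessage and onReceiveStatus separately, so the message can be captured even when the final status is not OK. The sketch below assumes the standard grpc-js client interceptor API; parameter types are intentionally loose, and how the captured message is handed back to application code (here a plain callback) is up to the caller.

```ts
import * as grpc from '@grpc/grpc-js'

// Sketch of a client interceptor that stashes the last response message seen on
// a call, even if the call later ends with a non-OK status such as
// "13 INTERNAL: Received RST_STREAM with code 0".
function captureMessageInterceptor(onMessage: (msg: unknown) => void) {
  return (options: any, nextCall: (options: any) => any) =>
    new grpc.InterceptingCall(nextCall(options), {
      start(metadata, listener, next) {
        next(metadata, {
          onReceiveMessage(message: unknown, nextMessage: (msg: unknown) => void) {
            onMessage(message) // capture before the final status is known
            nextMessage(message)
          },
        })
      },
    })
}

// Usage sketch: pass it in the client constructor options (or per call).
// const client = new SomeServiceClient(address, credentials, {
//   interceptors: [captureMessageInterceptor((msg) => { lastMessage = msg })],
// })
```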
Additional context
Add any other context about the feature request here.